Your feedback is highly appreciated! Let's make Vertabelo even better.
need help creating a data model
I have 2 CSV files that will be provided to me every month. Both files have the following fields: restaurant name, address, city & cuisine. The restaurant name field differs between the 2 files; for example, in the first CSV file the restaurant name is "bel-air hotel" and in the 2nd CSV file it's "hotel bel-air". However, the address & city fields are the same in both files: "701 stone canyon rd." & "bel air" respectively. I need to do the following:
1) Combine these 2 data sources & ensure the result is accurate- I was thinking of creating a primary key such as restaurant_id & having one table that holds the restaurant information: restaurant_id, name, address & city. A second table would hold restaurant_id as a foreign key plus the cuisine. Does this design make sense? If yes, then I was thinking of dumping these 2 files into a storage service such as Amazon S3 & then creating a SQL script that copies the data from the S3 location into the db tables on Redshift, followed by scheduling an ETL update job on a monthly cadence. I could also import the CSV files directly into an OLAP cube & do all the ETL cleaning, curation, massaging, transformation & validation over there, but I'm wondering how I would perform the data cleaning in that case.
2) Resolve the discrepancy between the 2 data sources- Since I've created a separate table for the restaurant information, the discrepancy would cease to exist, but I would like to know your thoughts about it. Is there any other approach to this?
3) Frame excellent questions for the data owners/ business owners-
The questions would center on the 7Ws - who, what, when, where, why, how & how many. These seem to map to the dimension tables, but I don't have an idea of what the fact table would be, as no business event has been mentioned to me. Can you please shed some light on this front as well?
4) The result of this analysis would be used as a feedback loop for the data owners to correct their source data. Build a report by listing all the columns & their definition as a feedback loop for data owners to fix the data errors in the source systems for continuous data improvements
Can you please help me get started with the questions above? I appreciate all your help & thank you in advance.
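One possible way to approach points 1 and 2 above, offered as a minimal sketch rather than a definitive design: treat the normalized address and city as the matching key, load both files into the two tables described in the question, and log every name mismatch so it can be reported back to the data owners. The sketch uses Python with an in-memory SQLite database purely for illustration (not Redshift). The CSV column headers, the `match_key` column, the helper names and the cuisine value "californian" are assumptions made for the example; only the names, address and city come from the question.

```python
import csv
import io
import re
import sqlite3

# Sample rows. The header names and the cuisine value are placeholders.
FILE_A = "name,address,city,cuisine\nbel-air hotel,701 stone canyon rd.,bel air,californian\n"
FILE_B = "name,address,city,cuisine\nhotel bel-air,701 stone canyon rd.,bel air,californian\n"


def match_key(address, city):
    """Build a comparison key that ignores case, punctuation and extra spaces."""
    text = f"{address} {city}".lower()
    text = re.sub(r"[^a-z0-9 ]", "", text)
    return " ".join(text.split())


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_db(sources):
    """Load all sources into two tables; return the connection and name conflicts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE restaurant (
            restaurant_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            address TEXT NOT NULL,
            city TEXT NOT NULL,
            match_key TEXT NOT NULL UNIQUE
        );
        CREATE TABLE restaurant_cuisine (
            restaurant_id INTEGER NOT NULL REFERENCES restaurant (restaurant_id),
            cuisine TEXT NOT NULL,
            PRIMARY KEY (restaurant_id, cuisine)
        );
    """)
    conflicts = []
    for rows in sources:
        for row in rows:
            key = match_key(row["address"], row["city"])
            found = conn.execute(
                "SELECT restaurant_id, name FROM restaurant WHERE match_key = ?",
                (key,),
            ).fetchone()
            if found is None:
                cur = conn.execute(
                    "INSERT INTO restaurant (name, address, city, match_key) "
                    "VALUES (?, ?, ?, ?)",
                    (row["name"], row["address"], row["city"], key),
                )
                restaurant_id = cur.lastrowid
            else:
                restaurant_id, known_name = found
                if known_name != row["name"]:
                    # Same place, different spelling: keep for the feedback report.
                    conflicts.append((key, known_name, row["name"]))
            conn.execute(
                "INSERT OR IGNORE INTO restaurant_cuisine VALUES (?, ?)",
                (restaurant_id, row["cuisine"]),
            )
    return conn, conflicts
```

Calling `build_db([load_rows(FILE_A), load_rows(FILE_B)])` stores the restaurant once and returns the two spellings of its name as a conflict. Note that a separate restaurant table does not remove the discrepancy by itself: some rule still has to decide that the two rows describe the same place, and matching on address and city is only one such rule. It would fail, for example, if the address were also spelled differently in the two files.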
Artificial primary keys in many-to-many relationships
While reading your article "A Dating App Data Model", I saw that in the tables that resolve a many-to-many relationship (for instance between "user_account" and "gender") you add an artificial primary key. In that example the primary key is a numeric field "id", whereas I would expect the primary key to be the combination of "user_account_id" and "gender_id".
Is there a technical reason to do that?
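To make the two designs in this question concrete, here is a small sketch using Python and an in-memory SQLite database. The junction table names `interested_in_composite` and `interested_in_surrogate` are hypothetical and are not taken from the article. The sketch does not speak for the article's author; it only shows that a surrogate "id" still needs a UNIQUE constraint on the pair of foreign keys to give the same duplicate protection as a composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_account (id INTEGER PRIMARY KEY);
    CREATE TABLE gender (id INTEGER PRIMARY KEY);

    -- Variant A: the pair of foreign keys is the primary key.
    CREATE TABLE interested_in_composite (
        user_account_id INTEGER NOT NULL REFERENCES user_account (id),
        gender_id INTEGER NOT NULL REFERENCES gender (id),
        PRIMARY KEY (user_account_id, gender_id)
    );

    -- Variant B: artificial key, with uniqueness of the pair enforced separately.
    CREATE TABLE interested_in_surrogate (
        id INTEGER PRIMARY KEY,
        user_account_id INTEGER NOT NULL REFERENCES user_account (id),
        gender_id INTEGER NOT NULL REFERENCES gender (id),
        UNIQUE (user_account_id, gender_id)
    );

    INSERT INTO user_account (id) VALUES (1);
    INSERT INTO gender (id) VALUES (1);
""")


def rejects_duplicate(table):
    """Insert the same (user, gender) pair twice; True if the second insert fails."""
    insert = f"INSERT INTO {table} (user_account_id, gender_id) VALUES (1, 1)"
    conn.execute(insert)
    try:
        conn.execute(insert)
    except sqlite3.IntegrityError:
        return True
    return False
```

Both variants reject a duplicate pair. Commonly cited reasons for preferring variant B are that other tables can reference a row with a single-column foreign key and that some ORM tools handle single-column keys more easily. Whether those reasons apply to the article's model is for its author to confirm.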
Support for Memsql DB
Is there any plan to provide support for MemSQL in near future?
Hello Deepak,
We have no dedicated support for MemSQL, but this database is wire-compatible with MySQL, so a MySQL DDL script may work. However, I haven't tested it, so treat this only as a clue.
We will definitely investigate this further.
Regards,
Luke
can't change database engine from postgres to another
it is not possible to create or change the database engine from postgres to another database
Hello,
It should be possible.
1. Please open the model.
2. In the "General" section there is a button to change the database engine, please click on it.
3. Select database.
4. Save changes.
Hope this helps,
Adam
Diagram User Interface Improvements
I have been using Vertabelo for several months now, and have a few suggestions for improving the diagram interface:
- Allow the diagram canvas to be resized - When I reverse engineered my data model, many tables were imported outside the canvas area and were hard to bring in (I had to zoom in, go to the canvas edge, then zoom out in order to see them). Once organized, the model is smaller than the canvas area, so I would like to decrease it for zoom and printing ease.
- Increase the zoom out maximum from 25% - It is hard to navigate throughout the diagram since the screen cannot zoom out to display it in its entirety. This is especially problematic since the display area doesn't scroll when moving objects.
- Enable automatic scroll when moving objects - When an object is dragged to the edge of the display area, the display area should shift so the object can continue to move. As it is, the user has to click and drag an object to the edge of the display, click the model to move the display area, and repeat as needed.
- Increase the size of the Area resize selection box - Areas can be resized by clicking and dragging the corner box. The corner box is too small to reliably click when zoomed out, which makes it difficult to resize at larger scales.
Pasting in column name field should paste text and not insert the diagram element.
Current behavior: When pasting text into the column name field, a diagram element is inserted.
Expected behavior: When pasting text into the column name field, the text from the clipboard should be inserted into the field.
[Postgresql] Will Vertabelo be updating to newer versions?
Just wondering if you'll be updating schema creation with PostgreSQL 10.x and 11.x features in the future.
Specifically I'd like to create auto-incrementing identity fields.
For example...
id int GENERATED BY DEFAULT AS IDENTITY
But they aren't a part of the postgresql version currently available in Vertabelo.
Put sequence generation on the top of the generated SQL script instead of bottom
Generated SQL scripts usually create tables before sequences. Because of this, model deployment fails on PostgreSQL: the tables reference sequences that do not exist yet.
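Until the generator changes, one possible workaround is to post-process the script so that sequences are created first. The following is a rough sketch in Python, not a Vertabelo feature; the function name `sequences_first` is made up for the example. It assumes statements are separated by semicolons and that no semicolon appears inside a string literal or a function body, so check the output before deploying it.

```python
import re

# Matches a statement that has a line starting with CREATE SEQUENCE,
# even when comment lines precede it.
SEQUENCE_RE = re.compile(r"^\s*create\s+sequence\b", re.IGNORECASE | re.MULTILINE)


def sequences_first(script):
    """Return the script with CREATE SEQUENCE statements moved to the top.

    The relative order inside each group (sequences, everything else) is kept.
    """
    statements = [s.strip() for s in script.split(";") if s.strip()]
    sequences = [s for s in statements if SEQUENCE_RE.search(s)]
    others = [s for s in statements if not SEQUENCE_RE.search(s)]
    return ";\n".join(sequences + others) + ";\n"
```

For example, passing in a script that creates a table with `DEFAULT nextval('t_seq')` followed by `CREATE SEQUENCE t_seq` returns the same two statements with the sequence first. One caveat: a sequence declared with `OWNED BY` a table column needs that table to exist already, so such statements would have to stay after the table.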