[CI-DWH] Requesting a New Table And Uploading Data to Cloud Insights DWH

The on-premise NetApp OnCommand Insight (OCI) Data Warehouse was often used for relatively small tables that could help enhance reports, and the MySQL database backend was very easy to access in the OCI world.

The Data Warehouse still exists in NetApp Cloud Insights, but it is not so easy to access, and the end user has much less control over the MySQL instance. You can still create new tables in dwh_custom, though, and it is easy to upload a CSV of data to be processed by the POST ETL script (yes, this exists in Cloud Insights).

Step 1) Do You Want to Drop/Truncate or Update an Existing Table?

If you have an existing table, or a new dataset to load into a new table, you need to decide whether you want to update the table or replace its contents (drop/truncate in MySQL).

I strongly believe that you should only have relatively small tables in the Cloud Insights Data Warehouse (dwh_custom is not meant to store tables that are many GB in size), in which case drop/truncate is the best approach. I actually prefer drop, because then you know that the table created by the 'CREATE TABLE IF NOT EXISTS' in Step 2 matches your design.

Answer = DROP

DROP TABLE IF EXISTS dwh_custom.YOUR_TABLE
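
If you preferred to keep the table in place and just empty its rows, the truncate equivalent would be:

TRUNCATE TABLE dwh_custom.YOUR_TABLE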

Note: You can load CSVs up to 100MB in size to Cloud Insights DWH, so you can still make substantial tables.


Step 2) Design Your Table

The first step is to design your table. If you already have a pre-existing table in an on-premise MySQL instance, this is simple; you can just run -

SHOW CREATE TABLE dwh_custom.YOUR_TABLE

- to get the commands needed to create the table.

You will need to edit the first line to -

CREATE TABLE IF NOT EXISTS dwh_custom.YOUR_TABLE

- before submitting the create table request to NetApp.

Note 1: I'd recommend an auto-incrementing ID field as your first/primary field in your table definition. If you have a clear ID field as your first field, it does mean you can delete individual rows (if you wanted that ability) using the REST API ODATA route, as sketched below.
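
As a hedged sketch only - the exact ODATA path here is an assumption, so verify it against the dwh-management section of your tenant's Swagger documentation - deleting the row with id 1 might look something like:

curl -X DELETE 'https://XXX.cloudinsights.netapp.com/rest/v1/dwh-management/odata/dwh_custom/YOUR_TABLE(1)' -H 'X-CloudInsights-ApiKey: YOUR_API_KEY'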

Note 2: I recommend not having TEXT fields (or BLOB), as Cognos can be a bit of a pain with them (you'll have to CAST them to something in the MySQL query). So CHAR, VARCHAR, INT, FLOAT ... etcetera (google MySQL Data Types). A sketch of a table definition follows.
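
Putting those two notes together, a minimal table definition sketch might look like this (the column names are hypothetical, purely for illustration):

CREATE TABLE IF NOT EXISTS dwh_custom.YOUR_TABLE (
  id INT NOT NULL AUTO_INCREMENT, -- Note 1: auto-incrementing ID as the first/primary field
  site_name VARCHAR(64),          -- Note 2: VARCHAR rather than TEXT, for Cognos' benefit
  capacity_gb FLOAT,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;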

Step 3) Design Your Load Data Local

Next, we need to design the LOAD DATA LOCAL statement. This is fairly simple. It does, of course, rely on your input CSV being correctly formatted.

If you don't have column headings in your CSV:

LOAD DATA LOCAL INFILE 'MyInput.CSV'
INTO TABLE dwh_custom.YOUR_TABLE
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\r\n';

If you do have column headings in your CSV:

LOAD DATA LOCAL INFILE 'MyInput.CSV'
INTO TABLE dwh_custom.YOUR_TABLE
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;

Note: The name of the CSV is important, as the POST ETL script will look for that exact file name every time.
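
For illustration, a MyInput.CSV with column headings - using the hypothetical columns from the Step 2 sketch - might look like this:

id,site_name,capacity_gb
1,London,1024.5
2,Singapore,2048.0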

Step 4) Test it Out

Before making the request, it is worth testing it out on your on-premise MySQL server. You can create a scratch copy, like a table_2, and test the create, load, and drop (see the worked example after the notes below). If you have mysql.exe on your client, the syntax to connect to your MySQL instance is something like this:

mysql -u{USER} -p{PASSWORD} -h{HOSTNAME} -P{PORT} --local-infile dwh_custom

Note 1: The default MySQL port is 3306; you can skip -P if using the default.

Note 2: You might have to play around a little with the LOAD DATA LOCAL INFILE to get it to work. If LINES TERMINATED BY '\r\n' does not work, try '\r' or '\n' (you might be able to work it out by unhiding all characters in your text editor). Also: FIELDS TERMINATED BY is a synonym for COLUMNS TERMINATED BY. And you might want to add - OPTIONALLY ENCLOSED BY '"' - after the FIELDS/COLUMNS TERMINATED BY.
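
For example, a full test pass on your on-premise instance - using the hypothetical columns from Step 2 and a table_2 name, so you don't touch the real table - might look like this:

DROP TABLE IF EXISTS dwh_custom.YOUR_TABLE_2;

CREATE TABLE IF NOT EXISTS dwh_custom.YOUR_TABLE_2 (
  id INT NOT NULL AUTO_INCREMENT,
  site_name VARCHAR(64),
  capacity_gb FLOAT,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

LOAD DATA LOCAL INFILE 'MyInput.CSV'
INTO TABLE dwh_custom.YOUR_TABLE_2
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;

SELECT COUNT(*) FROM dwh_custom.YOUR_TABLE_2; -- check the expected number of rows loaded

DROP TABLE dwh_custom.YOUR_TABLE_2;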

Step 5) Make your Request to the NetApp Cloud Insights SRE Team

The final step is to make the request to the NetApp Cloud Insights SRE Team. If you're a NetApp badged resource, you might be able to submit a NetApp Jira request yourself; otherwise you'll need to contact your NetApp Customer Success representative.

The request will go something like this:

Dear NetApp SRE Team,

Please can you add the following to the POST ETL script for tenant XXX.cloudinsights.netapp.com:

DROP TABLE IF EXISTS dwh_custom.YOUR_TABLE;

CREATE TABLE IF NOT EXISTS dwh_custom.YOUR_TABLE (
-- YOUR TABLE DEFINITION --
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

LOAD DATA LOCAL INFILE 'MyInput.CSV'
INTO TABLE dwh_custom.YOUR_TABLE
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\r\n';

Kind regards....

Note: You might as well upload a sample MyInput.CSV before the change above is made; then the first time the POST ETL script runs with the change, it will pick the CSV file up and process it.

For The Future) Automating the CSV Upload

The approach above allows you to update your dwh_custom table by manually uploading a CSV via /dwh-management/upload/csvs in the Swagger documentation. For the API route, check out this article:
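
In the meantime, an upload via the REST API might look something like the curl sketch below. Treat it as an assumption-laden example: the /rest/v1 path prefix and the customFile form field name are guesses to verify against your tenant's Swagger page, and YOUR_API_KEY is an API access token generated in the Cloud Insights UI.

curl -X POST 'https://XXX.cloudinsights.netapp.com/rest/v1/dwh-management/upload/csvs' -H 'X-CloudInsights-ApiKey: YOUR_API_KEY' -F 'customFile=@MyInput.CSV'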
