Sunday, 15 May 2022

[CI][DWH][ODATA] Limits $top $skip and $count

Carrying on from the previous post:

Something important I should mention about running queries against the NetApp Cloud Insights Data Warehouse is that there are certain limits on the amount of data that will be returned.

  • The maximum limit is 10'000 rows by default.
  • Without limit, the query will return 1'000 rows.
  • You can use $count to count the number of rows.
  • You can use $top to select the number of rows you want (remembering the 10'000 row limit)
  • You can use $skip to skip rows you don't want.

Playing around with Python and the dwh_inventory.hosts table (which is quite big in the environment I am running these queries on). These were my findings.

I couldn't get $count to work without returning data too. But what I could do is a count with just return 1 result.

Note: In the below I am using Python shell one liners (because it is super easy to up arrow and edit the one line of Python when you're playing around.)

Example: Getting the full count with just one row of data returned.

api = "dwh-management/odata/dwh_inventory/host?&count=true&$top=1";url = baseUrl + api;jsondata = requests.get(url, headers=headers, verify=False).json();jsondata['@odata.count']

In my case this returned 40929.
So if we want to get all the output then we need to do the following.

Example: Getting all 40929 results

hosts = []

api = "dwh-management/odata/dwh_inventory/host?&top=10000"
url = baseUrl + api
jsondata = requests.get(url, headers=headers, verify=False).json()
hosts += jsondata['value']

api = "dwh-management/odata/dwh_inventory/host?&top=10000&skip=10000"
url = baseUrl + api
jsondata = requests.get(url, headers=headers, verify=False).json()
hosts += jsondata['value']

api = "dwh-management/odata/dwh_inventory/host?&top=10000&skip=20000"
url = baseUrl + api
jsondata = requests.get(url, headers=headers, verify=False).json()
hosts += jsondata['value']

api = "dwh-management/odata/dwh_inventory/host?&top=10000&skip=30000"
url = baseUrl + api
jsondata = requests.get(url, headers=headers, verify=False).json()
hosts += jsondata['value']

api = "dwh-management/odata/dwh_inventory/host?&top=10000&skip=40000"
url = baseUrl + api
jsondata = requests.get(url, headers=headers, verify=False).json()
hosts += jsondata['value']

len(hosts)

At least that is how it should be working. I'm not sure if I was having weird issues with my Python/VDI, but what worked at one point, stopped working later on...

Note: Programmatically you would use a for loop for this, just demonstrating the top and skip here.

No comments:

Post a Comment

Note: only a member of this blog may post a comment.