Python Trino client / PyStarburst returns 1000 rows only

Hi,

I am currently exploring Starburst features and some use cases.

I have tried two Python clients, Trino and PyStarburst, and was able to connect to the catalog and schemas. When I run a query in either client, the results contain only the first 1000 rows of data. Is a configuration change at the admin account level needed for this?
I have seen that the Galaxy UI limits results to 1000 rows by default.
If anyone here has already faced or resolved this issue, please share your solution. I greatly appreciate your time and help!
Thanks

I ran the following with PyStarburst and got all 1279 rows of the astronauts table in Galaxy.

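# `session` is an already-connected PyStarburst session
a = session.table("sample.demo.astronauts")
# ask show() for more rows than the table holds so nothing is cut off
a.select("id", "name").show(2000)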

Can you share the PyStarburst code that is limiting you to 1000?

On the Starburst Galaxy Web UI, you can toggle the green “run” button to show an option that will run and download the full result set.

Client tools for Starburst and Trino (see Trino | Clients), including Python-based ones like Apache Superset or PyStarburst, all receive data as the user requests it and page through the results. There is no limit on the number of rows returned.

The 1000-record limit is a feature of the Starburst Galaxy UI only. Other client tools often apply their own limits as well. I am not aware of a default limit in PyStarburst, however.
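
To illustrate the paging, here is a minimal sketch of draining a result set in batches through the standard Python DB-API. The helper name count_rows_in_batches is made up for this post, and sqlite3 is only a stand-in so the snippet runs without a cluster. With the Trino Python client the cursor would come from trino.dbapi.connect(...).cursor() instead, which offers the same fetchmany() call; the connection details are left out because they depend on your account.

```python
import sqlite3  # stand-in DB-API driver so the sketch runs without a cluster


def count_rows_in_batches(cursor, batch_size=500):
    """Drain a DB-API cursor with fetchmany() and return the total row count."""
    total = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:  # an empty batch means the result set is exhausted
            break
        total += len(batch)
    return total


# Assumption: with the Trino client the cursor would instead come from
# trino.dbapi.connect(...).cursor(); the table below is stand-in data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE astronauts (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO astronauts VALUES (?, ?)",
    [(i, f"name_{i}") for i in range(1279)],
)

cur = conn.cursor()
cur.execute("SELECT id, name FROM astronauts")
print(count_rows_in_batches(cur))  # 1279, nothing is cut off at 1000
```

The loop keeps asking for the next batch until the cursor returns an empty one, so the total depends only on the query, not on any client-side cap.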

Update: It turns out that the show() method on data frames does have a default number of rows. That default is 10, however, not 1000.

See Dataframe — PyStarburst

@lester @simpligility Thanks for responding with useful information. I appreciate your help.

It was a silly mistake on my end: the sample catalog I was using has just 1000 rows across all its tables. I have tried queries in a different catalog and they run fine.

From what I have observed, the Trino Python client is slower than PyStarburst.
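
For anyone hitting the same thing, a quick way to rule this out is to count the rows in the source tables before suspecting the client. The sketch below is only an illustration: table_row_counts and the two table names are made up, and sqlite3 stands in for the real connection so it runs anywhere. A cursor from the Trino Python client could be passed in the same way.

```python
import sqlite3  # stand-in DB-API driver; a Trino cursor would be used the same way


def table_row_counts(cursor, table_names):
    """Return {table_name: row_count} by running SELECT count(*) on each table."""
    counts = {}
    for name in table_names:
        # Table names are interpolated directly, so pass only trusted names.
        cursor.execute(f"SELECT count(*) FROM {name}")
        counts[name] = cursor.fetchone()[0]
    return counts


# Hypothetical stand-in data: one table with exactly 1000 rows, one with more.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE small_sample (id INTEGER)")
conn.execute("CREATE TABLE bigger_table (id INTEGER)")
conn.executemany("INSERT INTO small_sample VALUES (?)", [(i,) for i in range(1000)])
conn.executemany("INSERT INTO bigger_table VALUES (?)", [(i,) for i in range(2500)])

counts = table_row_counts(conn.cursor(), ["small_sample", "bigger_table"])
for name, n in counts.items():
    print(f"{name}: {n} rows")
```

If the count itself comes back as 1000, the query result is complete and no limit is being applied.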
