Monday API: ETL 504 Gateway Timeout

Hello Guys!

I’m a Data Engineer, and while developing an ETL that automates the ingestion of a few boards into an S3 bucket (in Parquet), I’m running into 504 Gateway Timeout errors when consuming the Monday GraphQL API.

This is the query that I’m currently running, where self.board_id is the ID of the board:

query {
    boards (ids: """+self.board_id+""") {
        name
        description
        columns {
            title
        }
        items {
            id
            name
            column_values {
                id
                text
            }
            created_at
            state
            updated_at
        }
    }
}

While trying to add pagination to the query above (to avoid the timeout), I ran into another issue: I don’t have an automatic way of knowing how many pages I’ll need to extract for each of the boards that I’m consuming.

Also, this error (504 Gateway Timeout) doesn’t happen every day. It usually happens a few times a week, and the query works on the other days.

Be aware that I’m running this query within an Airflow Custom Operator (Python 3.6) inside a Kubernetes Pod every day at 4 AM (UTC-3), and I’m connecting to the GraphQL API with the GraphQLClient.

If anyone has an idea of how to implement the pagination or avoid the timeout, please share :smiley:

Hello @RomuloSchiavon and welcome to the community!

I hope you like it here :muscle:

This timeout error is most likely caused by a query that tries to retrieve a large amount of data and cannot finish within 60 seconds.

You can use pagination and check how many items you get back in each response from our server. When a response contains fewer items than the limit you chose, you have reached the last page and can stop iterating.
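As a rough sketch of that stopping rule in Python (not an official client): `iter_items` and `fetch_page` are hypothetical names. `fetch_page(page, limit)` stands for whatever function runs your query for one page and returns the list of items from the response, for example by passing `limit` and a page number as arguments on the `items` field. The exact argument names are an assumption here, so please check them against the API version you are using.

```python
from typing import Callable, Dict, Iterator, List


def iter_items(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: int = 100,
) -> Iterator[Dict]:
    """Yield items page by page until a page comes back shorter than `limit`.

    `fetch_page` is a hypothetical callable: it receives (page, limit),
    runs the GraphQL query for that page, and returns the list of items.
    """
    page = 1  # assumption: page numbering starts at 1
    while True:
        batch = fetch_page(page, limit)
        yield from batch
        # Fewer items than requested means this was the last page.
        # An empty page (item count is an exact multiple of `limit`) also stops here.
        if len(batch) < limit:
            break
        page += 1
```

With this shape you never need to know the number of pages up front: a board whose item count is an exact multiple of `limit` simply costs one extra request that returns an empty page. A smaller `limit` keeps each request further away from the 60-second cutoff, at the price of more requests.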

I hope that helps!

Cheers,
Matias

Hello Matias, thank you :smiley:

Thanks for the solution, simple but efficient hahaha.

Cheers mate

Glad to help @RomuloSchiavon !
