Kafka - Restart Failed Process

mwmoren Dataiku DSS Core Designer, Registered Posts: 5

I get random errors in my Kafka process due to GCS bucket failures and BigQuery size limits. I'm working with my teams to resolve them, but I'd like to know whether there is an easy way to restart a continuous process in the event of a failure.

I thought about setting up a scenario to start the process every 30 minutes or so, but I'm sure it will fail most of the time if the process is already running. Is what I need easily doable?

Thanks in advance!


Operating system used: Windows

Best Answer

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,717 Neuron
    edited 4:34PM Answer ✓

    You can use the Dataiku API to check which continuous recipes are running and start them if needed. Below is a code snippet. You could create a new continuous recipe that monitors your other continuous recipes and starts them as needed.

    import dataiku

    client = dataiku.api_client()
    # Replace with the key of the project that holds the continuous activities
    project = client.get_project('some project key with continuous activities')
    continuous_activities = project.list_continuous_activities()

    for recipe in continuous_activities:
        # 'alive' is True while the activity's main loop is running
        recipe_running = recipe.get_status()['mainLoopState']['futureInfo']['alive']
        print(f"{recipe.recipe_id} - Running: {recipe_running}")
        # Only start activities that are not already running
        if not recipe_running:
            recipe.start()
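
As a more defensive variant of the check above, here is a minimal sketch that treats a missing or empty status entry as "not running" instead of raising a `KeyError`. The helper names `is_alive` and `restart_stopped` are hypothetical and not part of the Dataiku API. The status layout (`mainLoopState` > `futureInfo` > `alive`) is taken from the snippet above. Whether an activity that has never been started actually omits these keys is an assumption.

```python
def is_alive(status):
    """Return True if a continuous activity's status dict reports a live main loop.

    Assumes the layout used in the snippet above:
    status['mainLoopState']['futureInfo']['alive'].
    Missing or empty levels are treated as "not running" (an assumption).
    """
    main_loop = status.get("mainLoopState") or {}
    future_info = main_loop.get("futureInfo") or {}
    return bool(future_info.get("alive", False))


def restart_stopped(activities):
    """Start every activity that is not alive; return the recipe ids started."""
    restarted = []
    for activity in activities:
        if not is_alive(activity.get_status()):
            activity.start()
            restarted.append(activity.recipe_id)
    return restarted


# Usage inside DSS (the project key is a placeholder):
#   import dataiku
#   project = dataiku.api_client().get_project("MY_PROJECT_KEY")
#   print(restart_stopped(project.list_continuous_activities()))
```

Because the helper only starts activities that are not alive, it should be safe to call on a schedule (for example every 30 minutes, as suggested in the question) without failing when the process is already running.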
