Scaling Horizontally & Resource Management with Kubernetes
Given that activities within the same job run as threads within the same JEK process, how do you size the Kubernetes pods accordingly? A job may contain sections where 20 activities can run concurrently, or may be completely sequential, using only 1 core.
a) Given the above, I don't believe there is a way to scale activities within a job horizontally across more than 1 pod, right?
b) Does it size the pods according to the maximum thread parallelism?
c) Do the pods scale vertically depending on CPU / Mem demand as more activities start running?
d) Can the user define how many cores and how much memory each scenario / job requires?
e) Or is every pod allocated a naïve, fixed, pre-set amount of CPU and memory?
Answers
-
Hi,
While each activity of a job runs as a thread in a single JEK process that runs the whole job, in many cases the JEK is a simple orchestrator: all the activity thread in the JEK does is wait for something else to complete the computation.
Each activity within a job runs independently from the others. Each may push down computation to a different kind of system:
* To a SQL database (SQL recipes, SQL engine for visual recipes)
* To a single pod on Kubernetes (Python and R recipes, most plugin recipes, some ML recipes)
* To multiple pods on Kubernetes (Spark code recipes, Spark engine for visual recipes, some ML recipes)
So within a single job, if multiple activities run concurrently (because they don't depend on one another), you may have anywhere between 0 and many pods, as each activity will (independently from the others) spawn 0, 1, or many pods.
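For illustration only (this is not Dataiku's documented API): if the pods spawned for a job carried a job-scoped label, you could observe this fan-out with the Kubernetes Python client. The `dataiku/job-id` label, the `dataiku-jobs` namespace, and the job id below are assumptions made up for the sketch:

```python
# Sketch: count the pods a single job has spawned at a point in time.
# The "dataiku-jobs" namespace and the "dataiku/job-id" label are
# illustrative assumptions, not Dataiku's actual conventions.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running in-cluster
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(
    namespace="dataiku-jobs",
    label_selector="dataiku/job-id=JOB_123",
)
print(f"{len(pods.items)} pod(s) currently spawned by this job")
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
```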
So to summarize:
a) Yes, multiple pods may be in use for scaling activities. In particular, if, for example, you have multiple Python recipes that execute concurrently, each will use its own pod, and you will therefore get full parallelism across multiple pods.
b) The sizing of each pod is driven only by the recipe configuration; what other activities may be running at the same time is not taken into account. For example, for a Python or R recipe running on Kubernetes, the sizing of the pod running it comes from the selected "containerized execution configuration", and can include Kubernetes requests and limits (which allows some level of dynamic sizing).
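For a concrete (hedged) picture of what such a configuration boils down to at the Kubernetes level, here is a minimal sketch using the official Kubernetes Python client; the image, pod name, namespace, and sizing values are placeholders, not values Dataiku itself uses:

```python
# Sketch: a pod spec with resource requests (the scheduler's guaranteed
# minimum) and limits (the hard ceiling) -- the two knobs a containerized
# execution configuration can set. All names and values are placeholders.
from kubernetes import client, config

config.load_kube_config()

container = client.V1Container(
    name="recipe-runner",
    image="my-registry/python-runtime:latest",  # placeholder image
    resources=client.V1ResourceRequirements(
        requests={"cpu": "2", "memory": "4Gi"},  # guaranteed minimum
        limits={"cpu": "4", "memory": "8Gi"},    # hard ceiling
    ),
)
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="python-recipe-pod"),
    spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
)
client.CoreV1Api().create_namespaced_pod(namespace="dataiku-jobs", body=pod)
```

The gap between request and limit is what gives the "some level of dynamic sizing" mentioned above: the pod is scheduled against its request but may burst up to its limit.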
c) The request and limit of each pod are determined at pod startup - this is how Kubernetes works. The fact that other activities start has no impact. If the cluster is full, the pods for the new activities will automatically be queued by Kubernetes (and, depending on the configuration, cluster autoscaling may occur).
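A queued activity simply shows up as a Pending pod. A minimal sketch for spotting these, together with the scheduler's explanation (the namespace is again a placeholder):

```python
# Sketch: list pods that Kubernetes has queued because the cluster is full,
# and print the scheduler's explanation (FailedScheduling events).
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pending = v1.list_namespaced_pod("dataiku-jobs", field_selector="status.phase=Pending")
for pod in pending.items:
    events = v1.list_namespaced_event(
        "dataiku-jobs",
        field_selector=f"involvedObject.name={pod.metadata.name}",
    )
    for ev in events.items:
        if ev.reason == "FailedScheduling":
            print(pod.metadata.name, "->", ev.message)
```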
d) The admin and user can define how many cores and how much memory each configuration (and hence each instance of an activity using that configuration) uses. In addition, using Resource Quotas, the admin can put limits on the total cores and memory each user or each project can use. If these quotas are reached, Kubernetes will queue further requests.
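At the Kubernetes level, such quotas are namespace-scoped ResourceQuota objects. Assuming, purely for the sketch, that a project maps to its own namespace:

```python
# Sketch: a ResourceQuota capping the total resources all pods in a
# namespace may request. Mapping one project to one namespace is an
# assumption of this example, not necessarily how Dataiku organizes things.
from kubernetes import client, config

config.load_kube_config()

quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="project-a-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "16",       # total CPU all pods may request
            "requests.memory": "64Gi",  # total memory all pods may request
            "pods": "20",               # max concurrent pods
        }
    ),
)
client.CoreV1Api().create_namespaced_resource_quota(namespace="project-a", body=quota)
```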
e) Each pod is allocated a pre-set "minimum guaranteed" (request) and "maximum usable" (limit) amount of CPU and memory, as this is how Kubernetes works. It's important to note that CPU and memory don't behave the same at all: if the CPU limit is exceeded, the activity simply throttles, as CPU is shareable; if the memory limit is exceeded, Kubernetes aborts the activity, as memory is not shareable.
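You can see this asymmetry after the fact: a CPU-throttled pod keeps running and leaves no trace in its termination state, while a pod that exceeded its memory limit reports `OOMKilled`. A small sketch (pod name and namespace reuse the placeholders from above):

```python
# Sketch: check whether a container was killed for exceeding its memory
# limit. CPU over the limit only throttles, so it never shows up here.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pod = v1.read_namespaced_pod(name="python-recipe-pod", namespace="dataiku-jobs")
for cs in pod.status.container_statuses or []:
    term = cs.state.terminated or (cs.last_state and cs.last_state.terminated)
    if term and term.reason == "OOMKilled":
        print(f"{cs.name}: killed for exceeding its memory limit")
```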
Hoping that this helps.
-
thanks @Clément_Stenac, extremely helpful!
A) To check my understanding: without Kubernetes, each instance of an activity would run as a separate thread in the same JEK process per job, right?
And if that's correct, then with Kubernetes enabled, each of those threads will be backed by a pod instead (unless the activity is deemed trivial)?
So even non-code visual recipes, partitions, and parallel slices of machine learning algorithms will each get a pod?
As a corollary of this, assuming:
i. I don't have access to a data warehouse or Spark cluster, just a standard RDS instance. That makes it harder to scale out horizontally on the backing store, and an incentive to perform workloads in memory in pods.
ii. I'd also like to use visual recipes where possible instead of Python recipes. However, if visual recipes force me to save and read each intermediate dataset back into the backing store, that would significantly slow down my data pipeline (comparable to how Spark's in-memory datasets are so much more efficient than Hadoop's original MapReduce).
Are there plans to chain visual recipes in memory, like SQL pipelines, without needing to do I/O to the backing store? If not, what is the recommendation for scaling horizontally under my constraints above?