Managing Nodes

Components of Distributed Builds

Builds in a Distributed Builds Architecture use nodes, agents, and executors which are distinct from the Jenkins controller itself. Understanding what each of these components are is useful when managing nodes:

Jenkins controller

The Jenkins controller is the Jenkins service itself and is where Jenkins is installed. It is a webserver that also acts as a "brain" for deciding how, when and where to run tasks. Management tasks (configuration, authorization, and authentication) are executed on the controller, which serves HTTP requests. Files written when a Pipeline executes are written to the filesystem on the controller unless they are off-loaded to an artifact repository such as Nexus or Artifactory.

Nodes

Nodes are the "machines" on which build agents run. Jenkins monitors each attached node for disk space, free temp space, free swap, clock time/sync and response time. A node is taken offline if any of these values go outside the configured threshold.

The Jenkins controller itself runs on a special built-in node. It is possible to run agents and executors on this built-in node although this can degrade performance, reduce scalability of the Jenkins instance, and create serious security problems and is strongly discouraged, especially for production environments.

Agents

Agents manage the task execution on behalf of the Jenkins controller by using executors. An agent is actually a small (170KB single jar) Java client process that connects to a Jenkins controller and is assumed to be unreliable. An agent can use any operating system that supports Java. Tools required for builds and tests are installed on the node where the agent runs; they can be installed directly or in a container (Docker or Kubernetes). Each agent is effectively a process with its own PID (Process Identifier) on the host machine.

In practice, nodes and agents are essentially the same but it is good to remember that they are conceptually distinct.

Executors

An executor is a slot for execution of tasks; effectively, it is a thread in the agent. The number of executors on a node defines the number of concurrent tasks that can be executed on that node at one time. In other words, this determines the number of concurrent Pipeline stages that can execute on that node at one time.

The proper number of executors per build node must be determined based on the resources available on the node and the resources required for the workload. When determining how many executors to run on a node, consider CPU and memory requirements as well as the amount of I/O and network activity:

  • One executor per node is the safest configuration.

  • One executor per CPU core may work well if the tasks being run are small.

  • Monitor I/O performance, CPU load, memory usage, and I/O throughput carefully when running multiple executors on a node.

Creating Agents

Jenkins agents are the "workers" that perform operations requested by the Jenkins controller. The Jenkins controller administers the agents and can manage the tooling on the agents. Jenkins agents may be statically allocated or they can be dynamically allocated through systems like Kubernetes, OpenShift, Amazon EC2, Azure, Google Cloud, IBM Cloud, Oracle Cloud, and other cloud providers.

This 30 minute tutorial from Darin Pope creates a Jenkins agent and connects it to a controller.

How to create an agent node in Jenkins

Launch inbound agent via Windows Scheduler

If you are having trouble getting the inbound agent installed as a Windows service (i.e., you followed the instructions on installing the agent as a service here but it didn’t work), an alternative method of starting the service automatically when Windows starts is to use the Windows Scheduler. 

We take advantage of the Windows Scheduler’s ability to run command at system startup

  1. Configure your node to use the "Launch agents by connecting it to the master" launch method

    • Click Save

  2. Note the command required to launch the agent

    • On the new agent node’s Jenkins page, note the agent command line shown. 

      • It will be like:

java -jar agent.jar \
-jnlpUrl http://<JenkinsHostName>:8080/computer/<nodeName>/slave-agent.jnlp \
-secret <some_long_hex_string>
  1. Obtain the agent.jar file and copy it to your new Windows agent node

    • In the command line noted in the last step, the "agent.jar" is a hyperlink. Click it to download the agent.jar file.

    • Copy the agent.jar file to a permanent location on your agent machine

  2. Ensure that you have a java version available on your agent machine

  3. Run the command manually from a CMD window on your agent to confirm that it works

    • Open the CMD window

    • Run the command the one like

java -jar agent.jar -jnlpUrl \
http://<JenkinsHostName>:8080/computer/<nodeName>/slave-agent.jnlp -secret <some_long_hex_string>
  • Go back to the node’s web page in Jenkins.  If everything works then page should say "Agent is connected"

  • Stop the command (control-c)

    1. Register a new scheduled job to run the same command

  • Open "Task Scheduler" on your windows machine

    • Start → Run: task Scheduler

  • Create a basic task (Menu: Action → Create Basic Task)

    • First page of the wizard:

      • Name: Jenkins Agent

      • Description (optional)

      • Click Next

    • Next page of the wizard

      • When do you want the task to start: select "When the computer starts"

      • Click Next

    • Next page of the wizard

      • What action do you want the task to perform: select "Start a program"

      • Click Next

    • Next page of the wizard

      • Program/Script: enter "java.exe" (or the full path to your java.exe)

      • Add arguments: enter the rest of the command, like

-jar agent.jar -jnlpUrl http://<JenkinsHostName>:8080/computer/<nodeName>/slave-agent.jnlp \
-secret <some_long_hex_string>
  • eg:

-jar D:\Scripts\jenkins\agent.jar \
-jnlpUrl http://jenkinshost.example.com:8080/computer/buildNode1/slave-agent.jnlp -secret \
d6a84df1fc4f45ddc9c6ab34b08f13391983ffffffffffb3488b7d5ac77fbc7
  • Click Next

    • Next page of the wizard

  • Click the check box "Open the Properties dialog for this task when I click Finish

  • Click Finish

    • Update the task’s properties

      • On the General tab

  • Select the user to run the task as

  • Select "Run whether user is logged on or not"

    • On the settings tab

  • Uncheck "Stop the task if it runs longer than"

  • Check "Run the task as soon as possible after a scheduled start is missed"

  • Check "If the task failed, restart every: 10 minutes", and "Attempt to restart up to: 3 times"

    • Click OK

      1. Start the scheduled task and again check that the agent is connected

        • Go back to the node’s web page in Jenkins.  If everything works then page should say "Agent is connected"

Monitor and Restart Offline Agents

This script can monitor and restart offline nodes if they are not disconnected manually. It can be executed in the Jenkins Script Console or can run periodically as a Jenkins job with the Groovy plugin.

import hudson.node_monitors.*
import hudson.slaves.*
import java.util.concurrent.*

jenkins = Jenkins.instance

import javax.mail.internet.*;
import javax.mail.*
import javax.activation.*

def sendMail (agent, cause) {

 message = agent + " agent is down. Check http://JENKINS_HOSTNAME:JENKINS_PORT/computer/" + agent + "\nBecause " + cause
 subject = agent + " agent is offline"
 toAddress = "JENKINS_ADMIN@YOUR_DOMAIN"
 fromAddress = "JENKINS@YOUR_DOMAIN"
 host = "SMTP_SERVER"
 port = "SMTP_PORT"

 Properties mprops = new Properties();
 mprops.setProperty("mail.transport.protocol","smtp");
 mprops.setProperty("mail.host",host);
 mprops.setProperty("mail.smtp.port",port);

 Session lSession = Session.getDefaultInstance(mprops,null);
 MimeMessage msg = new MimeMessage(lSession);

 //tokenize out the recipients in case they came in as a list
 StringTokenizer tok = new StringTokenizer(toAddress,";");
 ArrayList emailTos = new ArrayList();
 while(tok.hasMoreElements()) {
   emailTos.add(new InternetAddress(tok.nextElement().toString()));
 }
 InternetAddress[] to = new InternetAddress[emailTos.size()];
 to = (InternetAddress[]) emailTos.toArray(to);
 msg.setRecipients(MimeMessage.RecipientType.TO,to);
 InternetAddress fromAddr = new InternetAddress(fromAddress);
 msg.setFrom(fromAddr);
 msg.setFrom(new InternetAddress(fromAddress));
 msg.setSubject(subject);
 msg.setText(message)

 Transport transporter = lSession.getTransport("smtp");
 transporter.connect();
 transporter.send(msg);
}

def getEnviron(computer) {
   def env
   def thread = Thread.start("Getting env from ${computer.name}", { env = computer.environment })
   thread.join(2000)
   if (thread.isAlive()) thread.interrupt()
   env
}

def agentAccessible(computer) {
    getEnviron(computer)?.get('PATH') != null
}

def numberOfflineNodes = 0
def numberNodes = 0
for (agent in jenkins.getNodes()) {
   def computer = agent.computer
   numberNodes ++
   println ""
   println "Checking computer ${computer.name}:"
   def isOK = (agentAccessible(computer) && !computer.offline)
   if (isOK) {
     println "\t\tOK, got PATH back from agent ${computer.name}."
     println('\tcomputer.isOffline: ' + computer.isOffline());
     println('\tcomputer.isTemporarilyOffline: ' + computer.isTemporarilyOffline());
     println('\tcomputer.getOfflineCause: ' + computer.getOfflineCause());
     println('\tcomputer.offline: ' + computer.offline);
   } else {
     numberOfflineNodes ++
     println "  ERROR: can't get PATH from agent ${computer.name}."
     println('\tcomputer.isOffline: ' + computer.isOffline());
     println('\tcomputer.isTemporarilyOffline: ' + computer.isTemporarilyOffline());
     println('\tcomputer.getOfflineCause: ' + computer.getOfflineCause());
     println('\tcomputer.offline: ' + computer.offline);
     sendMail(computer.name, computer.getOfflineCause().toString())
     if (computer.isTemporarilyOffline()) {
       if (!computer.getOfflineCause().toString().contains("Disconnected by")) {
         computer.setTemporarilyOffline(false, agent.getComputer().getOfflineCause())
       }
     } else {
         computer.connect(true)
     }
   }
 }
println ("Number of Offline Nodes: " + numberOfflineNodes)
println ("Number of Nodes: " + numberNodes)


Was this page helpful?

Please submit your feedback about this page through this quick form.

Alternatively, if you don't wish to complete the quick form, you can simply indicate if you found this page helpful?

    


See existing feedback here.