Systemd: Zero to Hero – Part 3: Creating and Customizing Unit Files

Welcome back to our systemd journey! In Parts 1 and 2, we explored the fundamentals of systemd and mastered service management with systemctl. Now we're ready to dive into the heart of systemd customization: writing your own unit files. If systemctl is your steering wheel, unit files are the engine under the hood—they define exactly how your services behave, when they start, what they depend on, and how they recover from failures.

Unit files might seem intimidating at first (they certainly did to me when I stared at my first cryptic .service file), but they follow a logical structure that becomes second nature once you understand the patterns. By the end of this post, you'll be crafting unit files like a seasoned system administrator, complete with proper dependencies, restart policies, and even scheduled tasks.

Understanding Unit File Anatomy

The Basic Structure

Every systemd unit file follows a simple INI-style format with clearly defined sections. Think of it as a recipe card for systemd—each section tells the system a different aspect of how to handle your service:

[Unit]
# Metadata and dependencies go here

[Service]
# Service-specific configuration

[Install]
# How the unit integrates with the system

The beauty of this structure is its consistency across all unit types. Whether you're writing a .service, .timer, or .socket file, the [Unit] and [Install] sections work the same way; only the type-specific section in the middle ([Service], [Timer], [Socket]) changes.

File Locations and Precedence

Unit files live in a hierarchy of directories, and systemd has a specific order of precedence when loading them:

  1. /etc/systemd/system/ - Administrator-created units (highest priority)
  2. /run/systemd/system/ - Runtime units
  3. /usr/lib/systemd/system/ - Package-installed units (lowest priority)

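This hierarchy is brilliant because it lets you override vendor-supplied unit files without modifying the originals. Need to tweak nginx's startup behavior? Just create /etc/systemd/system/nginx.service and systemd will prefer your version. Keep in mind that your copy replaces the packaged file entirely, so later package updates to that unit no longer take effect.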
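
When only a setting or two needs to change, a drop-in file is often a lighter alternative to copying the whole unit. Here is a minimal sketch, assuming a packaged nginx.service; the restart settings are purely illustrative and not taken from any real nginx package:

```ini
# /etc/systemd/system/nginx.service.d/override.conf
# Hypothetical drop-in: these settings are merged on top of the packaged
# nginx.service, which stays untouched under /usr/lib/systemd/system/.
[Service]
Restart=on-failure
RestartSec=5
```

After a systemctl daemon-reload, systemctl cat nginx.service should print the packaged unit followed by the drop-in, which is a quick way to confirm that both were picked up.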

Crafting Your First Service Unit

A Simple Web Application Service

Let's start with a practical example. Suppose you've written a Python web application that needs to run as a system service:

[Unit]
Description=My Awesome Web App
Documentation=https://github.com/mycompany/awesome-app
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=webapp
Group=webapp
WorkingDirectory=/opt/awesome-app
Environment=FLASK_ENV=production
Environment=DATABASE_URL=postgresql://localhost/myapp
ExecStart=/opt/awesome-app/venv/bin/python app.py
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

Let's break down each directive and understand what makes this unit file robust:

The [Unit] Section Deep Dive

Description provides human-readable information that appears in systemctl status output. Make it descriptive but concise—your future self will thank you when troubleshooting at 3 AM.

Documentation points to relevant documentation. This seemingly minor directive becomes invaluable during troubleshooting or when onboarding new team members.

After and Wants work together to manage dependencies. After=network-online.target orders our service after that target, which is reached once the system's network management service reports the network as up. Wants=network-online.target is what actually pulls the target in. It is a soft dependency: if the network target fails, our service will still attempt to start.

The distinction between ordering (After/Before) and dependencies (Wants/Requires) is crucial. Ordering controls when things start, while dependencies control whether they start at all.
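
To make the distinction concrete, here is a small sketch of the two kinds of directive side by side. The unit itself is hypothetical; redis.service stands in for any other unit:

```ini
[Unit]
Description=Ordering versus dependency sketch (hypothetical unit)

# Dependency only: starting this unit also starts redis.service,
# but the two may start in parallel.
Wants=redis.service

# Ordering only: if redis.service is part of the same start-up, wait until
# it has finished starting. On its own, After= never starts redis.service.
After=redis.service
```

Used together, as here, the two lines mean "start Redis, then start me". Dropping either one changes the behavior, which is why most units declare both.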

Service Configuration Essentials

Type=simple tells systemd that our process runs in the foreground and doesn't fork. This is the most common type for modern applications. Other types include:

  • forking - Traditional daemons that fork into the background
  • oneshot - Tasks that run once and exit
  • notify - Applications that signal readiness to systemd with sd_notify()
  • dbus - Services that count as started once they acquire their BusName= on D-Bus
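
As an illustration of one of the other types, a traditional forking daemon could be described roughly like this. The daemon name, its flags, and the PID file path are all hypothetical:

```ini
# Sketch of a traditional daemon; legacyd and its options are made up.
[Service]
Type=forking
# The daemon forks and the parent exits; systemd treats that exit as "started".
ExecStart=/usr/sbin/legacyd --daemonize --pidfile /run/legacyd.pid
# Lets systemd identify the main process reliably after the fork.
PIDFile=/run/legacyd.pid
```

If the application can stay in the foreground instead, Type=simple or Type=notify is usually the simpler choice because systemd does not have to guess which process is the main one.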

User and Group specify the security context. Never run services as root unless absolutely necessary—create dedicated service accounts instead.

WorkingDirectory sets the current directory for your process. This prevents path-related issues and ensures your application finds its configuration files.

Environment variables can be set directly in the unit file. For sensitive data or complex environments, consider using EnvironmentFile instead:

EnvironmentFile=/etc/default/awesome-app

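ExecStart is the heart of your service: the command that actually runs your application. Give the executable as an absolute path, since older systemd versions reject anything else and newer ones only search a fixed set of directories. Arguments such as app.py are resolved by the program itself, relative to WorkingDirectory.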

Process Management and Reliability

Restart policies determine how systemd responds to service failures:

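  • no - Never restart (default)
  • on-success - Restart only after a clean exit (exit code 0 or a clean signal such as SIGTERM)
  • on-failure - Restart on a non-zero exit code, termination by an unclean signal, a timeout, or a watchdog timeout
  • on-abnormal - Restart on an unclean signal, a timeout, or a watchdog timeout, but not on a non-zero exit code
  • always - Always restart regardless of exit reason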

RestartSec adds a delay before restarting, preventing rapid restart loops that could overwhelm your system.

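TimeoutStartSec controls how long systemd waits for the service to finish starting before it gives up and stops it. Setting it to 0 disables the timeout, which is useful for services with lengthy initialization, but it also means a hung startup is never detected. With Type=simple the main process counts as started as soon as it is launched, so the timeout mainly matters for ExecStartPre commands and for types such as notify or forking.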

Advanced Service Patterns

Handling Complex Dependencies

Real-world services rarely exist in isolation. Consider a web application that depends on a database and Redis cache:

[Unit]
Description=E-commerce Web Service
After=postgresql.service redis.service network-online.target
Wants=network-online.target
Requires=postgresql.service
BindsTo=redis.service
# Start rate limiting is configured in [Unit], not [Service]
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Type=notify
User=ecommerce
ExecStartPre=/opt/ecommerce/bin/check-dependencies
ExecStart=/opt/ecommerce/bin/server
NotifyAccess=main
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target

Here we're using different dependency types strategically:

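  • Requires creates a hard dependency on PostgreSQL: if the database fails to start, our service is not started, and if the database is explicitly stopped or restarted, our service is stopped or restarted with it
  • BindsTo creates an even stronger bond with Redis: our service is also stopped if Redis goes away for any reason, including an unexpected exit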
  • Wants creates a soft dependency on network connectivity

ExecStartPre runs before the main process, perfect for health checks or initialization tasks. If this command fails, the service won't start.
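
Not every pre-start step has to be fatal. The sketch below reuses the paths from the unit above and adds a second, hypothetical warm-cache script to show the "-" prefix, which tells systemd to ignore a failure of that one command:

```ini
[Service]
# Must succeed; otherwise the service is marked failed and ExecStart= never runs
ExecStartPre=/opt/ecommerce/bin/check-dependencies
# The leading "-" means a non-zero exit is ignored (hypothetical script)
ExecStartPre=-/opt/ecommerce/bin/warm-cache
ExecStart=/opt/ecommerce/bin/server
```

Multiple ExecStartPre lines run one after another in the order they appear.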

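StartLimitIntervalSec and StartLimitBurst limit how often the unit may be started; both belong in the [Unit] section. This configuration allows up to 5 start attempts within 300 seconds. If the service exceeds that rate, systemd stops restarting it and leaves it in the failed state until the interval has passed or you run systemctl reset-failed.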
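
How the rate limit interacts with RestartSec is easier to see with the numbers next to each other. This is a sketch that reuses the values from the example and assumes a service that crashes immediately on every start:

```ini
[Unit]
# At most 5 starts per 300-second window
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Restart=on-failure
# Starts happen at roughly 0, 10, 20, 30 and 40 seconds. That is 5 starts
# in about 40 seconds, so the next attempt is refused and the unit stays failed.
RestartSec=10
```

With a much longer pause, say RestartSec=120, only about 3 starts fit into any 300-second window, so the limit would never trip and the service would keep retrying indefinitely. Which behavior is preferable depends on whether a crash loop should page someone or quietly keep trying.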

Environment Management Best Practices

For production services, managing environment variables securely is crucial. Here's a robust approach:

[Service]
Type=simple
User=myapp
Group=myapp
EnvironmentFile=-/etc/default/myapp
EnvironmentFile=-/etc/myapp/local.conf
Environment=LOG_LEVEL=info
ExecStart=/opt/myapp/bin/server

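The - prefix before environment files makes them optional: a missing file is silently skipped. Files are read in order, later files override earlier ones, and values from any EnvironmentFile override Environment= settings. This pattern allows for: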

  1. System-wide defaults in /etc/default/myapp
  2. Local overrides in /etc/myapp/local.conf
  3. Fallback values defined directly in the unit

Timer Units: Cron's Modern Successor

Timer units provide sophisticated scheduling capabilities that make cron look primitive. Let's create a backup system that runs every weekday and catches up on runs it missed while the machine was off:

Creating a Backup Timer

First, the service that performs the actual backup:

# /etc/systemd/system/daily-backup.service
[Unit]
Description=Daily Database Backup
Documentation=file:///opt/backup/README.md
Wants=network-online.target
After=network-online.target postgresql.service

[Service]
Type=oneshot
User=backup
Group=backup
ExecStart=/opt/backup/scripts/backup-database.sh
ExecStartPre=/bin/mkdir -p /var/backups/postgresql
Environment=BACKUP_RETENTION_DAYS=30
StandardOutput=journal
StandardError=journal
TimeoutStartSec=3600

Now the timer that schedules it:

# /etc/systemd/system/daily-backup.timer
[Unit]
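Description=Schedule weekday database backups
# No Requires= on the service: a timer activates the service of the same
# name by itself, and Requires= would start the backup whenever the timer starts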

[Timer]
OnCalendar=Mon..Fri *-*-* 02:00:00
Persistent=true
RandomizedDelaySec=600
AccuracySec=60

[Install]
WantedBy=timers.target

Timer Configuration Explained

OnCalendar uses systemd's powerful calendar syntax. Mon..Fri *-*-* 02:00:00 means "Monday through Friday at 2:00 AM", so despite the unit's name this backup skips weekends; drop the Mon..Fri prefix to run every day. Other examples:

  • daily - Every day at midnight
  • weekly - Every Monday at midnight
  • *-*-1 00:00:00 - First day of every month
  • *:0/15 - Every 15 minutes
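
Calendar expressions can also be combined in one timer. The fragment below is a sketch with illustrative schedules, not part of the backup example:

```ini
[Timer]
# Every day at 02:00, weekends included
OnCalendar=*-*-* 02:00:00
# Additional OnCalendar= lines add further trigger times to the same timer
OnCalendar=Sun *-*-* 14:00:00
```

To check an expression before committing it to a unit file, systemd-analyze calendar "Mon..Fri *-*-* 02:00:00" prints the normalized form and the next time it would elapse.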

Persistent=true records the last trigger time on disk. If your server was down during the scheduled backup time, the backup runs once as soon as the timer is activated again, typically at the next boot. It only applies to OnCalendar timers.

RandomizedDelaySec adds random delay to prevent thundering herd problems. With 600 seconds, the backup will run sometime between 2:00 AM and 2:10 AM.

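AccuracySec controls timing precision: systemd may delay the trigger by up to this long so it can batch wake-ups. The default is 1 minute, so the AccuracySec=60 above merely restates it. Lower it only if you need second-level precision.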

Monotonic Timers for Dynamic Scheduling

Timer units also support monotonic timers for event-based scheduling:

[Timer]
OnBootSec=15min
OnUnitActiveSec=1h
OnUnitInactiveSec=30min

  • OnBootSec - Runs 15 minutes after system boot
  • OnUnitActiveSec - Runs 1 hour after the service last started
  • OnUnitInactiveSec - Runs 30 minutes after the service last stopped
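
Put into a complete unit, a monotonic timer might look like the sketch below. All names are hypothetical. It also shows Unit=, which points the timer at a service whose name differs from the timer's own:

```ini
# /etc/systemd/system/cache-sweep.timer (hypothetical names throughout)
[Unit]
Description=Sweep the cache periodically

[Timer]
# First run 15 minutes after boot, then 1 hour after each activation
OnBootSec=15min
OnUnitActiveSec=1h
# Without Unit=, the timer would activate cache-sweep.service
Unit=cache-cleanup.service

[Install]
WantedBy=timers.target
```

OnUnitActiveSec only counts from a previous activation, so a timer that uses it generally needs a second trigger such as OnBootSec to get the first run going.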

Debugging Common Unit File Gotchas

The Template Unit Pattern

Template units allow creating multiple instances from a single unit file. They're incredibly useful for managing similar services with different configurations:

# /etc/systemd/system/webapp@.service
[Unit]
Description=Web Application Instance %I
After=network.target

[Service]
Type=simple
User=webapp
WorkingDirectory=/opt/webapps/%i
ExecStart=/opt/webapps/%i/bin/server
Environment=INSTANCE_NAME=%i
PrivateTmp=true

[Install]
WantedBy=multi-user.target

You can then create instances like:

  • systemctl start webapp@staging.service
  • systemctl start webapp@production.service

The %i specifier gets replaced with the instance name (escaped), while %I provides the unescaped version.
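
The difference between the two specifiers only shows up for instance names that contain escaped characters. As far as I understand systemd's escaping rules, a "-" in an instance name stands for "/", so the sketch below, which uses a hypothetical instance called eu-west, would expand like this:

```ini
# /etc/systemd/system/webapp@.service, started as webapp@eu-west.service
# (eu-west is a hypothetical instance name)
[Service]
# %i keeps the name as typed: /opt/webapps/eu-west
WorkingDirectory=/opt/webapps/%i
# %I undoes the path escaping, where "-" stands for "/": REGION=eu/west
Environment=REGION=%I
```

For plain names such as staging or production the two specifiers give the same result. When an instance name has to carry a literal dash or slash, the systemd-escape tool can generate a correctly escaped unit name.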

Avoiding Dependency Cycles

One of the most frustrating issues in unit file creation is dependency cycles. These occur when ordering dependencies (After= and Before=) form a loop:

A starts after B → B starts after C → C starts after A

Common causes include:

  1. DefaultDependencies: Many units automatically get dependencies on basic system targets
  2. Conflicting ordering: Using both Before= and After= with the same unit
  3. Complex dependency chains: Multiple levels of dependencies that loop back

To debug dependency cycles, use:

systemd-analyze verify myservice.service
systemctl list-dependencies --all myservice.service

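The fix usually means removing or reversing one ordering dependency in the loop. DefaultDependencies=no is only appropriate for units that must run in early boot or late shutdown, because it also removes the implicit ordering that lets the unit shut down cleanly.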
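
The smallest possible cycle involves just two units. The two hypothetical files below are shown in one listing for brevity:

```ini
# a.service (hypothetical)
[Unit]
After=b.service

# b.service (hypothetical)
# Each unit now waits for the other. When both are started together, systemd
# breaks the loop by dropping one of the jobs and logs an "ordering cycle"
# message in the journal.
[Unit]
After=a.service
```

Real cycles are rarely this obvious; they tend to run through three or four units, often including a target, which is why the verify output is worth reading line by line.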

Troubleshooting Permission Issues

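Unit files should be owned by root and writable only by root. systemd still loads files with looser permissions, but it logs a warning when a unit file is world-writable or marked executable, so it's worth getting this right: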

# Correct permissions for unit files
sudo chmod 644 /etc/systemd/system/myservice.service
sudo chown root:root /etc/systemd/system/myservice.service

# Always reload after changes
sudo systemctl daemon-reload

For user services, ensure the ~/.config/systemd/user/ directory and files have proper ownership.
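
User units differ from system units in a few details that are easy to trip over. Here is a minimal sketch, with a hypothetical unit name and binary:

```ini
# ~/.config/systemd/user/notes-sync.service (hypothetical unit and binary)
[Unit]
Description=Sync notes in the background

[Service]
# No User= here: the unit already runs as the user who owns it.
# %h expands to that user's home directory.
ExecStart=%h/bin/notes-sync

[Install]
# The per-user manager has no multi-user.target
WantedBy=default.target
```

Such a unit is managed with the --user flag and without sudo, for example systemctl --user daemon-reload followed by systemctl --user enable --now notes-sync.service.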

Syntax Validation

Before deploying unit files, always validate their syntax:

sudo systemd-analyze verify /etc/systemd/system/myservice.service

This catches typos like ExectStart instead of ExecStart before they cause runtime failures.

Real-World Examples and Production Patterns

Node.js Application with Health Checks

Here's a fuller unit for a Node.js application that adds a pre-start check, restart rate limiting, stop timeouts, and journal logging:

[Unit]
Description=Node.js API Server
Documentation=https://github.com/company/api-server
After=network-online.target
Wants=network-online.target
# Start rate limiting is configured in [Unit], not [Service]
StartLimitIntervalSec=600
StartLimitBurst=3

[Service]
Type=simple
User=nodeapp
Group=nodeapp
WorkingDirectory=/opt/api-server
Environment=NODE_ENV=production
Environment=PORT=3000
EnvironmentFile=-/etc/default/api-server
ExecStartPre=/opt/api-server/scripts/health-check.sh
ExecStart=/usr/bin/node server.js
ExecReload=/bin/kill -USR2 $MAINPID
ExecStop=/bin/kill -TERM $MAINPID
Restart=on-failure
RestartSec=5
TimeoutStartSec=60
TimeoutStopSec=30
KillMode=mixed
StandardOutput=journal
StandardError=journal
SyslogIdentifier=api-server

[Install]
WantedBy=multi-user.target

Database with Backup Integration

A PostgreSQL service with integrated backup scheduling:

# postgresql-custom.service
[Unit]
Description=PostgreSQL database server
Documentation=man:postgres(1)
After=network.target

[Service]
Type=notify
User=postgres
Group=postgres
ExecStart=/usr/bin/postgres -D /var/lib/postgres/data
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
TimeoutSec=0
Environment=PGDATA=/var/lib/postgres/data

[Install]
WantedBy=multi-user.target

Paired with its backup timer:

# postgresql-backup.timer
[Unit]
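Description=PostgreSQL Backup Schedule
# Activates postgresql-backup.service (not shown) by name; adding Requires=
# on it would run the backup every time the timer itself is started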

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
RandomizedDelaySec=1800

[Install]
WantedBy=timers.target

Container-Based Service

For containerized applications using Podman:

[Unit]
Description=Redis Container
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=containers
Group=containers
ExecStartPre=-/usr/bin/podman stop redis-server
ExecStartPre=-/usr/bin/podman rm redis-server
ExecStart=/usr/bin/podman run --name redis-server \
  --publish 6379:6379 \
  --volume redis-data:/data \
  redis:latest
ExecStop=/usr/bin/podman stop redis-server
ExecStopPost=/usr/bin/podman rm redis-server
Restart=on-failure
RestartSec=30
TimeoutStartSec=300

[Install]
WantedBy=multi-user.target

Testing and Validation Strategies

Unit Testing Your Unit Files

Before deploying to production, establish a testing workflow:

  1. Syntax validation: systemd-analyze verify
  2. Dependency analysis: systemctl list-dependencies
  3. Start/stop testing: Verify clean startup and shutdown
  4. Failure simulation: Test restart policies work correctly
  5. Resource verification: Confirm user permissions and file access

Staged Deployment

Use systemd's override capabilities for staged deployments:

# Create override directory
sudo mkdir -p /etc/systemd/system/myapp.service.d/

# Add staging configuration
sudo tee /etc/systemd/system/myapp.service.d/staging.conf <<EOF
[Service]
Environment=STAGE=staging
Environment=LOG_LEVEL=debug
EOF

sudo systemctl daemon-reload
sudo systemctl restart myapp.service

This pattern allows testing configuration changes without modifying the main unit file.
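
One caveat applies when a drop-in needs to change the command rather than the environment. Environment= lines simply add to what the main unit defines, but ExecStart= has to be cleared before it can be replaced. Here is a sketch; the file name and the --verbose flag are hypothetical:

```ini
# /etc/systemd/system/myapp.service.d/exec.conf (hypothetical file name)
[Service]
# An empty assignment clears the ExecStart= inherited from the main unit.
# Without it, a Type=simple service would end up with two ExecStart= lines
# and systemd would refuse to load it.
ExecStart=
ExecStart=/opt/myapp/bin/server --verbose
```

The same reset-then-set pattern applies to other list-style directives such as ExecStartPre=.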

Looking Forward

With solid unit file skills under your belt, you're ready for Part 4, where we'll explore advanced debugging techniques. We'll dive into journalctl mastery, dependency troubleshooting, and performance analysis tools that will make you a systemd diagnostician.

Unit files are the foundation of effective systemd management—they encode your operational knowledge into declarative configuration that systemd can execute reliably. The investment in learning proper unit file syntax and patterns pays dividends in system reliability and maintainability.

In Part 5, we'll push into expert territory with security sandboxing, resource controls, and advanced systemd features that transform your services from basic process management into hardened, monitored, and resource-controlled applications.

The journey from systemd novice to expert isn't just about memorizing directives—it's about understanding the patterns and principles that make modern Linux systems robust and manageable. Every unit file you write is a small piece of infrastructure that, when done well, runs silently and reliably for years.