LLM Integration for Internal Tools & SaaS Products (2026 Strategy Guide)

By 2026, AI software development with a native LLM layer is no longer an extra feature; it is the standard requirement. LLM integration for SaaS has become the baseline for modern platforms. If business software cannot learn, adapt, or automate on its own, it is already outdated. Whether a team is automating tedious internal tasks or turning its SaaS into a product that thinks for itself, success depends on how closely the AI is linked to the data and to the way the team works.

Honestly, the pace of AI software development has been unpredictable. What was experimental just a few years back is now completely normal. All organizations, from scrappy startups to large enterprises, are integrating LLMs right into their SaaS application development pipelines. And it is not just about adding a chatbot on top. The real shift? AI is becoming embedded in the core of products, reshaping how work gets done.

What’s pushing this change? Three big things:

  • People want scalable software solutions that respond instantly to users’ actions.
  • AI‑powered business intelligence (BI) is not just about dashboards anymore; it is about getting real answers, in plain language, from the data.
  • Companies care more than ever about privacy‑first AI software development and compliance, whether it is GDPR, SOC 2, or the new AI-related rules.

By 2026, skipping LLM integration is a sure way to fall behind; competitors are already building with AI in mind from the very beginning. The playbook has matured, too: businesses now have everything from machine learning pipelines to smart ways to keep SaaS data separate for different customers. It is not guesswork anymore; it is a repeatable, scalable framework.

[Figure: LLM Adoption Curve (2020-2026)]

Internal Tools vs. SaaS Products- Different Goals, Different Architectures

By 2026, companies won’t be debating whether to use AI anymore. The real question is how much of their systems should rely on it.

Gartner predicts that over 80% of enterprises will have generative AI running in production by then, a massive jump from less than 5% just a few years ago.

It is a big shift, and it highlights that building internal AI tools is a totally different game from SaaS application development.

Comparison Table

| Feature | Internal AI Tools | AI‑Powered SaaS Products |
|---|---|---|
| Primary Goal | Engineering productivity & operational ROI | User retention & market differentiation |
| Data Source | Private knowledge bases (Slack, Jira, wikis) | User‑generated data & behavioral logs |
| Compliance Focus | SOC 2, internal privacy, data leaks | GDPR‑compliant AI, multi‑tenancy isolation |
| Interface | Slackbots, internal dashboards, CLI | Conversational UI, embedded copilots |
| Integration Style | Point solutions for specific workflows | Deep LLM integration for SaaS across product layers |
| Scalability | Limited to team or department use | Designed as scalable software solutions for thousands of users |
| AI Software Development Approach | Focused on automating repetitive internal tasks | Built for AI‑powered business intelligence (BI) and personalization |
| Privacy Strategy | Controlled access within the company | Privacy‑first AI software development with anonymization and tenant isolation |
| Maintenance | Managed by internal IT or engineering teams | Continuous updates through SaaS release cycles |
| User Experience | Functional, task‑driven | Adaptive, proactive, and customer‑centric |

AI-Powered Internal Tools for Smarter Workflows

Internal tools are all about making work smoother and faster. With AI, that usually means assistants that summarize meetings, draft documents, or help engineers find information without having to look all over. The goal is to focus on ROI and efficiency, not market dominance.

SaaS Application Development with Embedded AI Layers

SaaS platforms have a different mission: build scalable software solutions and keep users coming back. Here, AI goes right into the workflow. LLMs offer smart suggestions, guide new users, and power business intelligence (BI) features that actually make sense of the data. This is where SaaS application development stops merely adding chatbots and starts to feel truly AI-native.

Compliance & Privacy‑First AI Software Development

Compliance matters everywhere. Internal teams worry about leaks and passing SOC 2 audits. SaaS providers deal with even tougher requirements: GDPR, privacy across many customers, the works. The answer is privacy-first AI software development: anonymize sensitive data before it reaches an external model. That builds trust and keeps everything on the right side of the rules.
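
A minimal Clojure sketch of that idea (the regex-only scrubbing is an illustrative assumption; production systems use dedicated PII detection):

(ns example.privacy
  (:require [clojure.string :as str]))

(defn anonymize
  "Replace obvious email addresses before text leaves the tenant."
  [text]
  (str/replace text #"[\w.+-]+@[\w-]+\.[\w.-]+" "[EMAIL]"))

(anonymize "Contact jane.doe@acme.com about the invoice")
;; => "Contact [EMAIL] about the invoice"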

Transforming Internal Workflows with AI Agents

The Death of Search, The Rise of Retrieval

Search is on its way out; retrieval is taking over. Instead of forcing employees to scroll through endless wikis, Slack threads, or Jira tickets, AI steps in with Retrieval‑Augmented Generation (RAG). Employees simply ask a question, and the AI finds the relevant information and returns a concise answer.

✅️ Example: A developer asks, “What’s the latest update on the payment API?” No digging through Jira. The AI finds the right entries and gives a clear update. It seems small, but over time it saves hours.
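
The core RAG loop is small. A hedged sketch, where retrieve and ask-llm are hypothetical stand-ins for a vector search and a model call:

(ns example.rag
  (:require [clojure.string :as str]))

(defn rag-answer [retrieve ask-llm question]
  (let [docs    (retrieve question 5)                   ;; top-5 relevant chunks
        context (str/join "\n---\n" (map :text docs))]  ;; stitch them together
    (ask-llm (str "Answer using only this context:\n" context
                  "\n\nQuestion: " question))))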

[Figure: RAG workflow, from search to RAG to AI assistant]

Automating the Boring Stuff

AI agents shine when it comes to routine tasks. They can:

  • Summarize meetings and automatically send out notes.
  • Turn chat discussions into Jira tickets.
  • Generate code documentation automatically.

✅️ Example: After sprint planning, the AI generates Jira tickets, assigns tasks, and sends out a summary. Engineers skip the admin work and get back to actual engineering.

Engineering Productivity Measurement

Teams are not just guessing about the impact of AI; they track it:

  • Discovery time drops. Developers find what they need faster.
  • Developer satisfaction goes up. AI tools smooth out daily work.
  • Routine tasks get done way faster.

✅️ Example: After rolling out RAG-based tools, a company saw developers spend 40% less time searching for documentation.

According to a McKinsey study, generative AI could add $2.6 trillion to $4.4 trillion to the global economy every year, just by making business functions more productive.

AI‑Native SaaS Application Development: Beyond the Chatbox

Most SaaS platforms started with simple chatbots or basic support features. But AI‑native SaaS changes the approach: instead of adding AI later, it is built into the product’s core. Workflows shift in real time. Insights surface before anyone asks. Personalization just happens, without users having to do a thing.

Embedded Intelligence for Scalable Software Solutions

Forget sitting around waiting for users to type into a help chat. Now, AI takes the lead. In a project management tool, it might spot a stuck task and remind the user of the next steps. A CRM identifies leads that are being overlooked. 

  • From sidebar chat → Proactive workflow suggestions.
  • Intelligence is not bolted on as an afterthought; it is built in from the start.
  • And because of that, these tools scale easily to thousands of users. No fuss, no endless setup.

AI‑Powered Business Intelligence (BI) in SaaS Platforms

BI dashboards are not just flashy graphs anymore. AI steps in and explains what the trends actually mean, points out unusual spikes, and even recommends the next move, all in plain terms.

  • Instead of complicated visuals, teams get clear reports.
  • Insights feel personal, tailored to each person’s role. 
  • Best of all, teams make faster decisions without waiting for a data analyst to translate the numbers.

Hyper‑Personalization through Privacy‑First AI Software Development

Personalization used to mean just showing the right product. Now, AI-native SaaS is shaped by what each user really wants, all while keeping privacy front and center. 

  • Onboarding paths change instantly as users explore.
  • Recommendations feel beneficial rather than enforced.
  • With privacy‑first AI, teams keep trust and compliance.

Why It Matters

AI-native SaaS is not about eye-catching new features. It is about building real intelligence right into the product, so people waste less time clicking around and get more value from the start. When it is done right, it scales up, protects privacy, and turns software into something that feels less like a tool and more like a true partner.

Technical Implementation: Machine Learning with Clojure (The Flexiana Approach)

Why Clojure Works So Well for LLM Orchestration

At Flexiana, Clojure is the backbone of our AI systems. Its functional style and immutability keep code stable and predictable, even as systems grow. That is a big deal when companies need orchestration layers that stay maintainable and scalable.

  • Immutability keeps data consistent across pipelines. In practice, that means fewer weird side effects and reliable results.
  • The REPL-driven workflow is a lifesaver, too. Developers tweak prompts and models instantly: no waiting, just fast feedback and quick fixes.
  • Flexiana relies on Clojure’s strengths for LLM orchestration. We build clean, functional pipelines to handle model calls, manage responses, and plug into other systems. No extra complexity. 

[Figure: Clojure code for LLM orchestration]
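
A minimal sketch of what such an orchestration can look like. The endpoint, model name, and response shape are assumptions, loosely modeled on OpenAI-style chat APIs:

(ns example.orchestration
  (:require [clj-http.client :as http]
            [cheshire.core :as json]))

(defn build-prompt [input]
  ;; 1. Build the prompt from the input
  (str "You are a BI assistant. Answer concisely.\n\n" input))

(defn call-llm [api-key prompt]
  ;; 2. Call the LLM API (endpoint and payload shape are assumptions)
  (-> (http/post "https://api.example.com/v1/chat/completions"
                 {:headers      {"Authorization" (str "Bearer " api-key)}
                  :body         (json/generate-string
                                 {:model    "small-model"
                                  :messages [{:role "user" :content prompt}]})
                  :content-type :json
                  :as           :json})
      :body))

(defn extract-answer [response]
  ;; 3. Process the response to pull out what matters
  (get-in response [:choices 0 :message :content]))

(defn orchestrate [api-key input]
  ;; 4. Wrap everything into one orchestration function
  (->> input build-prompt (call-llm api-key) extract-answer))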

What this orchestration does

  1. It takes the input and builds the prompt.
  2. It calls the LLM API.
  3. It processes the response to pull out what matters.
  4. It wraps all of that into one neat orchestration function.

Model Selection Strategy for AI‑Powered Business Intelligence (BI)

Flexiana’s model selection is not about chasing the latest and greatest. We keep it practical, balancing cost, efficiency, and the specific requirements of the job.

  • For heavyweight analysis or deeper reasoning, we use cutting-edge frontier models like GPT-5.3 or Opus 4.6 (as of March 24, 2026). These models dig deeper and extract more valuable insights, but they cost more.
  • For daily BI work (routine questions, dashboards, lightweight reports) we go with smaller models, especially Sonnet 4.6. These run faster and cost less.
  • Most of the time, Flexiana mixes both: frontier models handle the big, high-value analysis, while smaller models cover everyday tasks, letting solutions scale without wasting money. A routing sketch follows below.
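
A minimal routing sketch of that hybrid strategy (the complexity score, threshold, and model names are illustrative assumptions, not Flexiana's actual heuristics):

(defn pick-model [task]
  (if (or (> (:complexity task 0) 0.7)
          (:high-value? task))
    :frontier-model   ;; deep reasoning, higher cost
    :small-model))    ;; routine BI queries, fast and cheap

(pick-model {:complexity 0.9})  ;; => :frontier-model
(pick-model {:complexity 0.2})  ;; => :small-model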

Cost vs. performance: frontier models vs. small models vs. the hybrid strategy

| Approach | Cost Level | Performance Level | Best Use Cases | Trade‑offs |
|---|---|---|---|---|
| Frontier Models | High | Very High | Complex analysis, deep reasoning, nuanced BI | Expensive, slower response times |
| Small Models | Low | Moderate | Routine queries, dashboards, lightweight reports | Less accurate on complex tasks |
| Hybrid Strategy | Balanced | Adaptive | Mix of high‑value analysis + everyday reporting | Requires orchestration, but is cost‑efficient |

Why Flexiana’s Approach Stands Out

Flexiana cares about building systems that work: real solutions for real problems. We use Clojure and deliberate model selection to build BI tools that not only work on day one but also keep up as the business grows. Companies get valuable insights, efficient use of their resources, and a setup that keeps working as they scale.

Cracking the Multi‑Tenant AI Puzzle in SaaS Application Development

Let’s be real: integrating AI with a SaaS platform is no simple task. Multi-tenant systems have to juggle many customers at once while maintaining high performance, strong privacy, and unbreakable security. Flexiana focuses on what truly matters.

❶ Data Isolation

When teams serve numerous tenants, they cannot cut corners on data separation. Every customer’s data has to stay private: no exceptions, no accidental crossovers.

Flexiana draws clear lines from the database all the way up to the AI layer. Strong tenant boundaries, workflows that keep data in place, and pipelines that scale without losing trust. Customers are assured that their data remains secure even as the system expands.

❷ Prompt Injection Defense

Large language models are powerful, but not flawless. Malicious users sometimes trick models into breaking rules or revealing hidden info.

Flexiana blocks them at the checkpoint, with built-in filters that detect suspicious input, validation layers that enforce safe responses, and monitoring that detects emerging tactics. With these protections, users do not have to worry about AI misuse.

Privacy‑First AI Software Development for Multi‑Tenant SaaS

Flexiana does not add privacy as an afterthought; we integrate it from the very beginning. Every feature, every layer, follows strict privacy standards and keeps tenant data confidential. We stick to EU GDPR guidelines and give customers real control over their data, keeping everything transparent. This way, the AI is not just smart; it is responsible.

Why It All Matters

Trust is everything in multi-tenant SaaS. Flexiana’s focus on tight data isolation, strong defenses, and a privacy-first mindset means our AI systems stay secure, scale easily, and stay compliant. That is how we build something customers can actually rely on.

Measuring the ROI of LLM Integration in AI Software Development

Bringing large language models (LLMs) into business software is neither cheap nor fast, and businesses want to know if it is actually worth the effort. ROI is not just about saving money; it is about moving faster, getting people on board, and making things run smoother. At Flexiana, we break it down into three main areas.

Internal Tools

LLMs can take a lot of the pain out of daily work. Companies see the benefits when teams solve problems faster and feel like they actually have the right tools.

  • Time to Resolution: Track how long it takes to fix issues, before and after adding AI. If tickets used to run for two hours and now get wrapped up in thirty minutes, that is real progress.
  • Employee Satisfaction: Just ask the teams. Are these tools helping? Simple surveys or regular feedback can help identify whether AI really makes their work easier.

These figures demonstrate whether AI is indeed simplifying tasks rather than adding more processes.

SaaS Products

For customer‑facing platforms, ROI comes from how much people use the new features and how much less support they need.

  • Feature Adoption: Check how often customers use the AI features. If people love them and use them a lot, the company knows they are useful and easy to figure out.
  • Support Ticket Reduction: Monitor support ticket volume. If customers need less help because AI guides them correctly, everyone wins. Less support means lower costs and happier users.

This helps companies see whether AI is actually improving their products and removing obstacles to progress.

Cost Optimization

Behind the scenes, businesses have to make smart choices, since running LLMs is not free. There is a clear difference between using external APIs and running smaller models in‑house. 

An API may seem cheap at a few cents per request, but the bill grows quickly. Once demand is high enough, switching to a locally hosted quantized model saves money over time. It is all about balancing flexibility now against savings in the long run. An ROI calculator helps with that.
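
A toy break-even sketch of that comparison (the figures are illustrative assumptions, not real pricing):

(defn monthly-api-cost [requests cost-per-request]
  (* requests cost-per-request))

(defn break-even-requests
  "Request volume at which a fixed self-hosting cost beats per-request API pricing."
  [monthly-hosting-cost cost-per-request]
  (/ monthly-hosting-cost cost-per-request))

(monthly-api-cost 50000 0.02)      ;; => 1000.0 (per month on the API)
(break-even-requests 2000.0 0.02)  ;; => 100000.0 (requests per month)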



Why This Matters

ROI isn’t just a box to tick to prove AI is worth it. It is about making better decisions as your business grows. When companies track internal efficiency, how customers use the product, and what it costs to keep everything running, they actually see where LLMs make a difference and where they need to make changes.

❓What People Often Ask (FAQs) 

Q1: Will integrating an LLM make my SaaS too expensive?

Not always. APIs are easy to set up, but costs rise as usage grows. Running smaller models yourself takes more work at first, but you end up saving money in the long run.

Q2: How does privacy‑first AI software development prevent hallucinations?

It limits how much data the AI sees and puts safety checks in place. That reduces mistakes, keeps data safe, and supports compliance. Plus, it builds trust.

Q3: Do I need a dedicated AI software development team?

If you want to move fast and handle growth, a team helps a lot. When you are just getting started, you can stick with APIs or managed services- they get the job done. Once your SaaS starts to expand quickly, having real experts on board makes everything run more smoothly and improves what you deliver. 

Q4: What does AI‑powered business intelligence (BI) do for SaaS?

It analyzes consumer data and identifies patterns, then gives guidance on shaping your product. BI takes all that raw information and turns it into something you can actually use, making your platform smarter and more useful.

Q5: How do scalable software solutions support AI integration?

They let you handle more users and data without slowing down. When you add more AI features, your system stays fast, and costs remain controlled.

Q6: Can I use machine learning with Clojure for SaaS AI?

Definitely. Clojure’s concurrency capabilities and design make it a good option for machine learning pipelines. It helps you add AI features that are reliable and easy to maintain.

Here’s The Bottom Line 

If companies are building SaaS applications, LLM integration is not a nice-to-have anymore; it is expected. Teams have two main paths: plug in external APIs for a faster launch, or run smaller models in-house for more control. The choice depends on what they want to invest, how big they want to grow, and how closely they need to monitor things.

Sticking to privacy-first design and building software that scales is what keeps a business platform solid. When teams follow smart AI development practices, customers can actually trust what they see. AI-powered business intelligence is not just a set of buzzwords, either: it gives teams a clear view of customer behavior, helps them spot trends before everyone else, and guides product decisions with real data. And for more advanced work, tools like machine learning with Clojure make it possible to build pipelines that don’t break down and are straightforward to maintain.

At the end of the day, integrating AI is not about chasing trends. It is about making SaaS tools that actually work and scale with business goals.

Want to see what that looks like for your business? Book a consultation with Flexiana, and let’s figure out how LLMs can shape your SaaS strategy.


From Functions to Data - Evolving a Pull-Pattern API

Context

We built our pull-pattern API on lasagna-pull, a library designed by Robert Luo that lets clients send EDN patterns to describe what data they want. The core pattern-matching engine is solid. But as we added more resources, roles, and mutation types, we wanted a different model for how patterns interact with data sources and side effects. This article is about the design decisions behind lasagna-pattern, the successor stack that replaces the function-calling handler layer while building on the same pull-pattern ideas.

For context on what the new architecture looks like, see Building a Pure Data API with Lasagna Pattern. For the monorepo structure that hosts the libraries, see Clojure Monorepo with Babashka.

The old model: patterns call functions

In lasagna-pull, the core mechanism was :with. Patterns contained function calls: (list :key :with [args]) told the engine to look up :key in a data map, call the function stored there, and pass it args. Functions returned {:response ... :effects ...}.

Here is what a few common operations looked like.

List all dashboards (read):

{:dashboards
 {(list :role/user :with [])
  {(list :self :with []) [{:title '? :id '?}]}}}

The outer :with checked authorization. The inner :with called a function to list all entries. The vector with map shape [{:title '? :id '?}] described which fields to return.

Read by ID:

{:dashboards
 {(list :role/user :with [])
  {(list :dashboard :with [{:id 123} :read])
   {:title '? :content '?}}}}

The :read action dispatched inside the function via case.

Create:

{:dashboards
 {(list :role/user :with [])
  {(list :dashboard :with [{:title "New" :content "..."} :save])
   {:id '? :title '?}}}}

Same function, different action. The function returned {:response data :effects {:rama {...}}}, and a separate executor ran the side effects after the pattern resolved.

On the server, the data map was a nested structure of functions:

(defn pullable-data [session]
  {:dashboards
   {:role/user (with-role session :user
     (fn []
       {:dashboard (fn [data action]
                     (case action
                       :read   {:response (get-dash (:id data))}
                       :save   {:response data
                                :effects  {:rama {...}}}
                       :delete {:response true
                                :effects  {:rama {...}}}))}))}})

Authorization was a function wrapper: with-role took a session, a role keyword, and a thunk that returned the data map. If the role was missing, the thunk never ran.

The saturn handler: pure by design

This architecture had a name: the "saturn handler" pattern, designed by Robert Luo. The idea was to split request handling into three stages:

  1. Injectors provided dependencies (DB snapshot) to the request
  2. Saturn handler (purely functional) ran the pull pattern, accumulated {:response, :effects-desc, :session} with zero side effects
  3. Executors took the effects descriptions and actually ran them (DB transactions, cache invalidation)

The context-of mechanism coordinated accumulation during pattern resolution. A modifier function extracted :response, :effects, and :session from each operation result. A finalizer attached the accumulated effects and session updates to the final result. The handler itself never touched the database for writes.

;; Saturn handler: purely functional, no side effects
(defn saturn-handler [{:keys [db session] :as req}]
  (let [pattern (extract-pattern req)
        data    (pullable-data db session)
        result  (pull/with-data-schema schema (mk-query pattern data))]
    {:response     ('&? result)
     :effects-desc (:context/effects result)
     :session      (merge session (:context/sessions result))}))

This was a clean separation. The saturn handler was fully testable with no mocks. Effects were pure data descriptions. The executor was the only impure component, and it was small. The original implementation is documented in the archived flybot.sg repository.

What pushed us to redesign

The saturn handler separation was elegant, but as the system grew, specific limitations emerged.

Response before effects. The saturn handler computed :response before the executor ran :effects. This worked when the response data was already known (e.g., returning the input entity on create). But when you needed something produced by the side effect itself (a DB-generated ID, a timestamp set by the storage layer, a merged entity after a partial update), you were stuck. The f-merge escape hatch existed: a closing function in the effects description that could amend the response after execution. But using f-merge essentially reintroduced in-place mutation, defeating the purpose of the pure/impure split.

Verb-oriented patterns. Every pattern was a set of function calls. Reading all items called a function. Reading one item called a different function with a :read action. Creating called the same function with a :save action. The case dispatch inside each :with function grew as operations multiplied. The pattern language was supposed to describe data, but it was describing procedure calls.

Authorization at two granularities. with-role gated access to the entire data map (coarse). But ownership enforcement (can this user edit this specific item?) had to live inside the :with function's case dispatch (fine). These were two different authorization mechanisms in two different places, with no intermediate layer for "can mutate, but only own entities."

Indirection through context-of. The modifier/finalizer mechanism in context-of was well-designed for what it did: accumulate effects and session updates during pattern resolution without side effects. But it was a layer you had to understand to trace a request end-to-end. Each operation returned {:response :effects :session :error}, the modifier unpacked those, and the finalizer attached the accumulations. The mechanics were sound, but the indirection meant debugging required following the data through several stages.

The saturn handler pattern achieved something valuable: a fully testable, purely functional request handler. The redesign was not about fixing a broken system. It was about recognizing that once collections replaced functions as the API's building blocks, the pure/impure split could happen at a different boundary (inside DataSource methods), and the accumulation machinery was no longer needed.

The new model: patterns match data

The rewrite inverted the relationship. Instead of patterns calling functions, patterns match against data structures. Collections implement ILookup (Clojure's get protocol) for reads and a Mutable protocol for writes. The pattern engine does not know about functions. It just walks a data structure.

Here are the same operations in the new model.

List all dashboards:

'{:user {:dashboards ?all}}

:user is a top-level key in the API map. If the session has the user role, it resolves to a map containing :dashboards. If not, it resolves to nil. ?all is a variable that binds to (seq dashboards), triggering list-all on the DataSource.

Read by ID:

'{:user {:dashboards {{:id $id} ?dash}}}
;; client sends: {:pattern ... :params {:id 123}}

{:id $id} is a lookup key. $id gets replaced with 123 before the pattern compiles. The collection's ILookup implementation receives {:id 123} and delegates to the DataSource's fetch method.

Create:

{:user {:dashboards {nil {:title "New" :content "..."}}}}

nil as a key means "create". The collection's Mutable implementation calls create! on the DataSource. The response is the full created entity.

No :with, no action keywords, no case dispatch. The pattern syntax itself encodes the operation: ?var means read, nil key means create, nil value means delete, key + value means update.
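
For completeness, the one case not shown above, an update, combines a lookup key with a value map under those conventions (assuming the same dashboard resource):

;; key + value = update: find the dashboard by ID, merge new fields
{:user {:dashboards {{:id 123} {:title "Renamed dashboard"}}}}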

On the server, the data map is a structure of collections, not functions:

(defn make-api [{:keys [storage cache]}]
  (let [dashboards (coll/collection (->DashboardSource storage cache)
                                    {:id-key :id
                                     :indexes #{#{:id}}})]
    (fn [ring-request]
      (let [session (:session ring-request)]
        {:data   {:user  (when (:user session)
                           {:dashboards dashboards})
                  :owner (when (:owner session)
                           {:users users-collection
                            :roles roles-collection})}
         :schema {:user  {:dashboards [:vector Dashboard]}
                  :owner {:users [:vector User]}}
         :errors {:detect :error
                  :codes  {:forbidden 403 :not-found 404}}}))))

Side by side

The contrast is clearest when you see old and new patterns next to each other.

Simple read (list all)

;; OLD: two nested function calls
{:dashboards
 {(list :role/user :with [])
  {(list :self :with []) [{:title '? :id '?}]}}}

;; NEW: structural traversal
'{:user {:dashboards ?all}}

The old pattern needed two :with calls just to list everything: one for role checking, one for the listing function. The new pattern walks a data structure. If :user exists in the API map, :dashboards is a collection, and ?all binds to its contents.

Read by ID with parameters

;; OLD: function call with arguments
{:dashboards
 {(list :role/user :with [])
  {(list :dashboard :with [{:id 123} :read])
   {:title '? :content '?}}}}

;; NEW: indexed lookup with $params
'{:user {:dashboards {{:id $id} ?dash}}}
;; params: {:id 123}

:with [{:id 123} :read] called a function and passed it two arguments. {:id $id} is text substitution: $id becomes 123, then {:id 123} is used as a lookup key on the collection. The difference is that $params happens before pattern compilation. There is no function call in the pattern at all.

Create

;; OLD: function call with :save action
{:dashboards
 {(list :role/user :with [])
  {(list :dashboard :with [{:title "New" :content "..."} :save])
   {:id '? :title '?}}}}

;; NEW: nil key = create
{:user {:dashboards {nil {:title "New" :content "..."}}}}

The old model used the same function for reads and writes, distinguished by an action keyword (:read, :save, :delete). The new model uses structural conventions: nil as the key means create. The collection's Mutable protocol handles it.

Delete

;; OLD: function call with :delete action
{:dashboards
 {(list :role/user :with [])
  {(list :dashboard :with [{:id 123} :delete])
   {:id '?}}}}

;; NEW: query key + nil value = delete
{:user {:dashboards {{:id 123} nil}}}

nil as the value means delete. No action keywords, no function dispatch.

Analytics query (complex parameters)

;; OLD: arbitrary query object via :with
{:analytics
 {(list :raw :with [{:data-source [:module-1 :stats]
                      :select :col-name
                      :time-range {:from "2026-01-01" :to "2026-02-01"}}])
  '?}}

;; NEW: query object as lookup key via $params
'{:user {:analytics {$query ?result}}}
;; params: {:query {:data-source [:module-1 :stats]
;;                  :select :col-name
;;                  :time-range {:from "2026-01-01" :to "2026-02-01"}}}

The query object is the same in both cases. The difference is where it lives: inside a function call (old) versus as a lookup key (new). The DataSource's fetch method receives the full query map and routes internally.

What we gained

Authorization is structural, not functional

Old: (with-role session :user (fn [] ...)) wraps a thunk. Authorization is a function that gates access to other functions.

New: top-level keys in the API map are nil when the session lacks the role. The pattern simply gets nil for unauthorized paths. No function call, no wrapper.

;; Session has :user but not :owner
{:data {:user  {:dashboards dashboards}   ;; present
        :owner nil}}                       ;; nil: patterns against :owner return nothing

For finer-grained checks (ownership enforcement on mutations), wrap-mutable intercepts write operations:

(coll/wrap-mutable dashboards
  (fn [inner query value]
    (if (owns? session query)
      (coll/mutate! inner query value)
      {:error {:type :forbidden}})))

This is still structural: a decorator around a collection, not conditional logic inside a handler.

$params replaces :with

:with called a function with arguments at pattern-resolution time. $params does text substitution before the pattern is even compiled.

;; $params: symbol replacement before compilation
'{:users {{:id $uid} ?user}}
;; + params {:uid 123}
;; becomes: {:users {{:id 123} ?user}}

The pattern engine never sees $uid. By the time it runs, the pattern is pure data. This means patterns are always static structures from the engine's perspective, which simplifies the implementation and makes patterns easier to reason about.

No context-of, modifier, or finalizer

The old context-of mechanism was well-engineered: modifier functions extracted :response/:effects/:session from each operation, accumulated them in transient collections, and the finalizer attached them to the result. The saturn handler stayed pure throughout. It was a clean solution to the problem of accumulating side-effect descriptions during pattern resolution.

The new system does not need any of it:

  • Side effects happen inside DataSource methods (not returned as data)
  • Sessions are managed by Ring middleware (not returned from patterns)
  • The pattern result is the final response (no post-processing)

The tradeoff: the saturn handler's strict pure/impure boundary is gone. DataSource methods perform side effects directly, which means the handler is no longer purely functional. In practice, this turned out to be acceptable because DataSource implementations are small, focused, and testable in isolation. The purity moved from the handler level to the collection wrapper level (decorators like wrap-mutable and read-only are pure transformations).

Side effects live inside DataSource

Old: functions returned {:response ... :effects {:rama {...}}}. The saturn handler accumulated these descriptions. A separate executor ran them afterward. The handler was purely functional.

New: create!, update!, and delete! in DataSource perform the side effects directly. The return value is the entity itself, not a description of work to be done.

(defrecord DashboardSource [storage cache]
  coll/DataSource
  (create! [_ data]
    (storage-append! storage [data :save])
    (assoc data :id (generate-id) :created-at (now)))
  (delete! [_ query]
    (storage-append! storage [query :delete])
    true))

This solves the "response before effects" problem directly: create! performs the write and returns the full entity with DB-generated fields. No f-merge, no two-phase response construction.

The tradeoff is that the handler is no longer purely functional. If you need the old effects-description pattern for testing, you can wrap the DataSource to capture effects without executing them. But the default path is direct execution, which is simpler to trace.
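
A hypothetical capturing wrapper might look like this, assuming the DataSource protocol methods referenced above (fetch, list-all, create!, update!, delete!) live in the coll namespace:

(defn capturing-source
  "Wrap a DataSource so writes are recorded in the `captured` atom
   instead of executed. Reads delegate to the inner source."
  [inner captured]
  (reify coll/DataSource
    (fetch    [_ q]      (coll/fetch inner q))
    (list-all [_]        (coll/list-all inner))
    (create!  [_ data]   (swap! captured conj [:create data]) data)
    (update!  [_ q data] (swap! captured conj [:update q data]) data)
    (delete!  [_ q]      (swap! captured conj [:delete q]) true)))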

Error handling as data

Collections return errors as plain maps:

{:error {:type :forbidden :message "You don't own this resource"}}
{:error {:type :not-found}}

The remote layer maps error types to HTTP status codes via a declarative config:

{:detect :error
 :codes  {:forbidden 403 :not-found 404 :invalid-mutation 422}}

This keeps collections pure (they return data describing what happened) while the transport layer decides how to represent it. The design is heading toward GraphQL-style partial responses, where one branch failing does not fail the whole pattern. A request for {:user ?data :admin ?admin-stuff} should return :user data even if :admin is forbidden, with errors collected in a top-level array alongside the data.
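
Under that design, a mixed request could produce a response shaped roughly like this (a hypothetical illustration of the planned behavior):

;; :user succeeds; the :admin error is collected alongside the data
{:data   {:user {:dashboards [{:id 1 :title "Q1 report"}]}}
 :errors [{:path [:admin] :type :forbidden}]}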

Conclusion

The old saturn handler architecture was a genuinely clean design: a purely functional handler, effects as data descriptions, executors as the only impure component. It achieved testability and separation of concerns that many web frameworks do not even attempt.

The redesign was not about fixing something broken. It was about moving the purity boundary. The saturn handler kept the entire request pipeline pure by deferring effects. The new model keeps collections and their wrappers pure by pushing side effects into DataSource methods. The accumulation machinery (context-of, modifier, finalizer) disappears because there is nothing to accumulate. The response-before-effects limitation disappears because create! returns the entity directly.

The deeper lesson is about API identity. When your API is a set of handler functions, cross-cutting concerns (authorization, transport, error handling) become imperative code woven through those handlers. When your API is a data structure, those same concerns become structural: the shape of the map enforces authorization, the protocols enforce CRUD semantics, and the transport layer works generically over any ILookup-compatible structure.

Verbs become nouns, and the nouns compose.


Clojure Protocols and the Decorator Pattern

Context

Clojure's built-in functions work on built-in types because those types implement specific Java interfaces. get works on maps because maps implement ILookup. seq works on vectors because vectors implement Seqable. count works on both because they implement Counted.

The interesting part: your custom types can implement the same interfaces. Once they do, Clojure's standard library treats them as first-class citizens. get, seq, map, filter, count all work transparently, no special dispatch, no wrapper functions.
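
You can confirm this at the REPL:

(instance? clojure.lang.ILookup {})       ;; => true, get works on maps
(instance? clojure.lang.Seqable [1 2 3])  ;; => true, seq works on vectors
(instance? clojure.lang.Counted {})       ;; => true, count works on both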

The lasagna-pattern collection library (Clojars) does exactly this. It defines a Collection type backed by a database, then implements ILookup and Seqable so that (get coll {:post/id 3}) triggers a database query while looking like a plain map lookup to the caller. The companion article, Building a Pure Data API with Lasagna Pattern, covers the full architecture. This article focuses on the Clojure constructs that make it work.

The four constructs

Clojure provides four ways to define types that implement protocols and interfaces. Each serves a different purpose.

defprotocol: the contract

Defines method signatures with no implementation. Conceptually similar to a Java interface.

(defprotocol DataSource
  (fetch [this query])
  (list-all [this])
  (create! [this data])
  (update! [this query data])
  (delete! [this query]))

This says: "any storage backend must support these 5 operations." It does not say how. The implementation is left to the types that satisfy the protocol.

defrecord: named, map-like type

A concrete implementation of a protocol. Has named fields and behaves like a Clojure map (you can assoc, dissoc, and destructure it).

(defrecord PostsDataSource [conn]
  DataSource
  (fetch [_ query]    (d/q ... @conn))
  (list-all [_]       (d/q ... @conn))
  (create! [_ data]   (d/transact conn [data]))
  (update! [_ q data] (d/transact conn [(merge ...)]))
  (delete! [_ query]  (d/transact conn [[:db/retractEntity ...]])))

Use defrecord for persistent, reusable implementations with named fields: storage backends, services, configuration holders.

deftype: named, not map-like

Like defrecord but without map behavior. Used for structural wrappers that implement platform interfaces rather than domain protocols.

(deftype Collection [data-source id-key indexes]
  clojure.lang.ILookup
  (valAt [this q] (.valAt this q nil))
  (valAt [_ q nf] (or (fetch data-source q) nf))

  clojure.lang.Seqable
  (seq [_] (seq (list-all data-source))))

Use deftype when you need to override built-in Clojure verbs (get, seq, count). The type itself is opaque. Callers interact with it through standard Clojure functions, not through field access.

reify: anonymous, inline type

Same capability as deftype but anonymous and created inline. Closes over local variables.

(defn profile-lookup [session]
  (reify clojure.lang.ILookup
    (valAt [this k] (.valAt this k nil))
    (valAt [_ k nf]
      (case k
        :name  (:user-name session)
        :email (:user-email session)
        nf))))

Use for one-off objects, per-request wrappers, or cases where a named type would be overkill. The session value is captured from the enclosing scope.

Summary table

| Construct | What it is | When to use |
|---|---|---|
| defprotocol | Contract (method signatures) | Define a role: "what must a DataSource do?" |
| defrecord | Named type, map-like | Concrete implementations: PostsDataSource |
| deftype | Named type, not map-like | Structural wrappers: Collection |
| reify | Anonymous inline type | One-off objects: per-request lookups |

Overriding built-in verbs

Each Clojure interface corresponds to a built-in verb. Implementing an interface teaches Clojure how your custom type responds to that verb.

ILookup: powers get

When you call (get thing key), Clojure calls (.valAt thing key nil) under the hood. Maps implement this by default. Custom types do not.

;; Without ILookup
(deftype Box [x])
(get (->Box 42) :x)  ;; => nil (Box doesn't implement ILookup)

;; With ILookup
(deftype SmartBox [x y]
  clojure.lang.ILookup
  (valAt [this k] (.valAt this k nil))
  (valAt [_ k nf]
    (case k :x x :y y nf)))

(get (->SmartBox 1 2) :x)  ;; => 1

In the collection library, ILookup is what makes (get coll {:post/id 3}) trigger a database query. The caller writes standard Clojure. The collection translates the get call into a fetch on the underlying DataSource.

Seqable: powers seq (and map, filter, etc.)

clojure.lang.Seqable
(seq [_] (seq (list-all data-source)))

Once a type implements Seqable, all sequence functions work: (seq coll), (map f coll), (filter pred coll). The collection becomes iterable by delegating to its DataSource's list-all.

Counted: powers count directly

clojure.lang.Counted
(count [_] (count (list-all data-source)))

Without Counted, calling count on a custom Seqable type throws UnsupportedOperationException. Clojure's RT.count() does not fall back to seq. It only works on types that implement Counted, IPersistentCollection, java.util.Collection, or a few other JDK interfaces. If your custom type needs to support count, implement Counted explicitly. This also lets you provide an optimized path (e.g., a SELECT COUNT(*) instead of fetching all rows).

Custom protocols

The interfaces above override Clojure's built-in verbs. But some operations have no built-in verb. The collection library defines two custom protocols for these cases.

| Protocol | Verb | Purpose |
|---|---|---|
| Mutable | mutate! | Unified CRUD: (nil, data) = create, (query, data) = update, (query, nil) = delete |
| Wireable | ->wire | Serialize for HTTP transport: collections become vectors, lookups become maps or nil |

mutate! unifies create, update, and delete into a single function. The operation is determined by the combination of arguments: nil query means create, nil value means delete, both present means update.
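
In use, the three operations look like this (posts is a collection as in the earlier examples):

(coll/mutate! posts nil {:title "New post"})         ;; create
(coll/mutate! posts {:post/id 3} {:title "Edited"})  ;; update
(coll/mutate! posts {:post/id 3} nil)                ;; delete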

Wireable is conceptually similar to clojure.core.protocols/Datafiable (datafy). Both turn opaque types into plain Clojure data. The difference is intent: datafy is for introspection and navigation, ->wire is specifically for HTTP serialization.

The decorator pattern

Here is the key design insight: one DataSource, one Collection, multiple wrappers per role.

Without wrappers (bad)

;; 3 records duplicating the same Datahike queries
(defrecord GuestPostsDataSource [conn] ...)
(defrecord MemberPostsDataSource [conn] ...)
(defrecord AdminPostsDataSource [conn] ...)

Each record contains a full copy of the same fetch, list-all, create!, update!, and delete! logic. Domain logic changes must be applied to all three.

With wrappers (good)

(def posts (db/posts conn))               ;; one DataSource, one Collection

(public-posts posts)                       ;; reify: override get/seq to strip email
(member-posts posts user-id email)         ;; wrap-mutable: override mutate! for ownership
posts                                      ;; admin: no wrapper needed

The DataSource is created once. Each role gets a thin wrapper that overrides only the behavior it needs. Reads, storage queries, and domain logic live in one place.

Wrapper functions

| Wrapper | What it overrides | Use case |
|---|---|---|
| coll/read-only | Removes Mutable entirely | Guest access (no writes) |
| coll/wrap-mutable | Overrides mutate!, delegates reads | Ownership enforcement |
| reify (manual) | Override any interface | Transform read results, composite routing |
| coll/lookup | Provides ILookup from a keyword-value map | Non-enumerable resources (profile, session data) |

Example: building views per role

(let [posts (db/posts conn)]    ;; one record, created once

  {:guest  {:posts (public-posts posts)}            ;; reify over read-only, strips :user/email
   :member {:posts (member-posts posts uid email)}  ;; wrap-mutable, ownership checks
   :admin  {:posts posts}                           ;; raw collection, full access
   :owner  {:users (coll/read-only (db/users conn))}})

Guests see a read-only view with PII stripped. Members see a mutable view that enforces ownership. Admins see the raw collection. Each wrapper does one thing.

The public-posts wrapper demonstrates how reify serves as the escape hatch when the built-in wrappers are not enough:

(defn- public-posts [posts]
  (let [inner (coll/read-only posts)]
    (reify
      clojure.lang.ILookup
      (valAt [_ query]
        (when-let [post (.valAt inner query)]
          (strip-author-email post)))

      clojure.lang.Seqable
      (seq [_]
        (map strip-author-email (seq inner))))))

The library provides read-only (restricts writes) and wrap-mutable (intercepts writes), but no built-in way to transform read results. For that, you implement ILookup and Seqable directly via reify.

Three layers of authorization

Authorization in this pattern is distributed structurally rather than imperatively. Instead of a single middleware that checks permissions, three layers each handle a different granularity.

Coarse: with-role (API map structure)

Binary gate: you have the role or you don't. The entire subtree of collections is present or replaced with an error map.

(defn- with-role [session role data]
  (if (contains? (:roles session) role)
    data
    {:error {:type :forbidden :message (str "Role " role " required")}}))

;; In make-api:
:owner (with-role session :owner
         {:users users, :users/roles roles})

A non-owner sending '{:owner {:users ?all}} hits the error map, not the collection. The remote/ layer detects errors along variable paths, so the error flows through as inline data and prevents any mutation from being attempted.

A planned improvement is an error-gate function that replaces the plain map with a reify implementing ILookup (returns self for any key, so deeply nested pattern traversal keeps working), Mutable (returns the error for mutations), and Wireable (serializes as the error map). This would be a good example of composing three protocols into a single anonymous sentinel object.
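
A sketch of what that sentinel could look like (hypothetical, since the error-gate is only planned):

(defn error-gate [error]
  (reify
    clojure.lang.ILookup
    (valAt [this _]   this)           ;; any key returns self, so traversal continues
    (valAt [this _ _] this)
    coll/Mutable
    (mutate! [_ _ _] {:error error})  ;; mutations surface the error
    coll/Wireable
    (->wire [_] {:error error})))     ;; serializes as the error map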

Medium: wrap-mutable (per-entity mutation rules)

Controls who can create, update, or delete specific entities:

(coll/wrap-mutable posts
  (fn [posts query value]
    (if (owns-post? posts user-email query)
      (coll/mutate! posts query value)
      {:error {:type :forbidden}})))

Reads pass through untouched. Only mutations are intercepted. The check is per-entity: does this user own this specific post?

Fine: reify decorator (field-level read transformation)

Controls which fields are visible:

(public-posts posts)  ;; strips :user/email from author on every read

Every get and seq call on this wrapper runs through a transformation function that removes sensitive fields before the data reaches the caller.

Authorization summary

| Layer | Tool | What it guards | Example |
|---|---|---|---|
| Coarse | with-role | "Can you access :owner at all?" | Non-owners get error map |
| Medium | wrap-mutable | "Can you mutate this entity?" | Members can only edit own posts |
| Fine | reify decorator | "What fields can you see?" | Guests don't see author email |

The DataSource stays "dumb" about authorization. It only knows about storage. This keeps it reusable across all roles without conditional logic.

When to skip the full stack

Not everything needs defrecord + DataSource + Collection. If a resource is read-only, non-enumerable, and has a single query shape, a raw reify implementing ILookup + Wireable is enough.

Example: a post history lookup that takes a post ID and returns the revision history:

(defn post-history-lookup [conn]
  (reify
    clojure.lang.ILookup
    (valAt [_ query]
      (when-let [post-id (:post/id query)]
        (post-history @conn post-id)))
    (valAt [this query not-found]
      (or (.valAt this query) not-found))

    coll/Wireable
    (->wire [_] nil)))  ;; can't enumerate all history

The pattern engine still calls get on it, so it works identically from the caller's perspective. The full DataSource/Collection stack would add index validation, Seqable, Mutable, none of which history needs.

Decision guide

| Need | Tool |
|---|---|
| Full CRUD + enumeration + index validation | defrecord + coll/collection |
| Read-only, keyword keys, flat values | coll/lookup |
| Read-only, map keys, single query shape | Raw reify with ILookup + Wireable |

Note: coll/lookup only supports keyword keys (:email, :name). For map keys like {:post/id 3}, use a raw reify.


Build and Deploy Web Apps With Clojure and Fly.io

This post walks through a small web development project using Clojure, covering everything from building the app to packaging and deploying it. It’s a collection of insights and tips I’ve learned from building my Clojure side projects, but presented in a more structured format.

As the title suggests, we’ll be deploying the app to Fly.io. It’s a service that allows you to deploy apps packaged as Docker images on lightweight virtual machines. [1] My experience with it has been good; it’s easy to use and quick to set up. One downside of Fly is that it doesn’t have a free tier, but if you don’t plan on leaving the app deployed, it barely costs anything.

This isn’t a tutorial on Clojure, so I’ll assume you already have some familiarity with the language as well as some of its libraries. [2]

Project Setup

In this post, we’ll be building a barebones bookmarks manager for the demo app. Users can log in using basic authentication, view all bookmarks, and create a new bookmark. It’ll be a traditional multi-page web app and the data will be stored in a SQLite database.

Here’s an overview of the project’s starting directory structure:

.
├── dev
│   └── user.clj
├── resources
│   └── config.edn
├── src
│   └── acme
│       └── main.clj
└── deps.edn

And the libraries we’re going to use. If you have some Clojure experience or have used Kit, you’re probably already familiar with all the libraries listed below. [3]

deps.edn
{:paths ["src" "resources"]
 :deps {org.clojure/clojure               {:mvn/version "1.12.0"}
        aero/aero                         {:mvn/version "1.1.6"}
        integrant/integrant               {:mvn/version "0.11.0"}
        ring/ring-jetty-adapter           {:mvn/version "1.12.2"}
        metosin/reitit-ring               {:mvn/version "0.7.2"}
        com.github.seancorfield/next.jdbc {:mvn/version "1.3.939"}
        org.xerial/sqlite-jdbc            {:mvn/version "3.46.1.0"}
        hiccup/hiccup                     {:mvn/version "2.0.0-RC3"}}
 :aliases
 {:dev {:extra-paths ["dev"]
        :extra-deps  {nrepl/nrepl    {:mvn/version "1.3.0"}
                      integrant/repl {:mvn/version "0.3.3"}}
        :main-opts   ["-m" "nrepl.cmdline" "--interactive" "--color"]}}}

I use Aero and Integrant for my system configuration (more on this in the next section), Ring with the Jetty adapter for the web server, Reitit for routing, next.jdbc for database interaction, and Hiccup for rendering HTML. From what I’ve seen, this is a popular “library combination” for building web apps in Clojure. [4]

The user namespace in dev/user.clj contains helper functions from Integrant-repl to start, stop, and restart the Integrant system.

dev/user.clj
(ns user
  (:require
   [acme.main :as main]
   [clojure.tools.namespace.repl :as repl]
   [integrant.core :as ig]
   [integrant.repl :refer [set-prep! go halt reset reset-all]]))

(set-prep!
 (fn []
   (ig/expand (main/read-config)))) ;; we'll implement this soon

(repl/set-refresh-dirs "src" "resources")

(comment
  (go)
  (halt)
  (reset)
  (reset-all))

Systems and Configuration

If you’re new to Integrant or other dependency injection libraries like Component, I’d suggest reading “How to Structure a Clojure Web”. It’s a great explanation of the reasoning behind these libraries. Like most Clojure apps that use Aero and Integrant, my system configuration lives in a .edn file. I usually name mine resources/config.edn. Here’s what it looks like:

resources/config.edn
{:server
 {:port #long #or [#env PORT 8080]
  :host #or [#env HOST "0.0.0.0"]
  :auth {:username #or [#env AUTH_USER "john.doe@email.com"]
         :password #or [#env AUTH_PASSWORD "password"]}}

 :database
 {:dbtype "sqlite"
  :dbname #or [#env DB_DATABASE "database.db"]}}

In production, most of these values will be set using environment variables. During local development, the app will use the hard-coded default values. We don’t have any sensitive values in our config (e.g., API keys), so it’s fine to commit this file to version control. If there are such values, I usually put them in another file that’s not tracked by version control and include them in the config file using Aero’s #include reader tag.
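
For example (file names are hypothetical):

;; resources/config.edn: merge in an untracked secrets file
{:stripe #include "secrets.edn"}

;; secrets.edn (git-ignored):
{:api-key "sk_live_..."}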

This config file is then “expanded” into the Integrant system map using the expand-key method:

src/acme/main.clj
(ns acme.main
  (:require
   [aero.core :as aero]
   [clojure.java.io :as io]
   [integrant.core :as ig]))

(defn read-config
  []
  {:system/config (aero/read-config (io/resource "config.edn"))})

(defmethod ig/expand-key :system/config
  [_ opts]
  (let [{:keys [server database]} opts]
    {:server/jetty (assoc server :handler (ig/ref :handler/ring))
     :handler/ring {:database (ig/ref :database/sql)
                    :auth     (:auth server)}
     :database/sql database}))

The system map is created in code instead of being in the configuration file. This makes refactoring your system simpler as you only need to change this method while leaving the config file (mostly) untouched. [5]

My current approach to Integrant + Aero config files is mostly inspired by the blog post “Rethinking Config with Aero & Integrant” and Laravel’s configuration. The config file follows a similar structure to Laravel’s config files and contains the app configurations without describing the structure of the system. Previously, I had a key for each Integrant component, which led to the config file being littered with #ig/ref and more difficult to refactor.

Also, if you haven’t already, start a REPL and connect to it from your editor. Run clj -M:dev if your editor doesn’t automatically start a REPL. Next, we’ll implement the init-key and halt-key! methods for each of the components:

src/acme/main.clj
(ns acme.main
  (:require
   ;; ...
   [acme.handler :as handler]
   [acme.util :as util]
   [next.jdbc :as jdbc]
   [ring.adapter.jetty :as jetty]))
;; ...

(defmethod ig/init-key :server/jetty
  [_ opts]
  (let [{:keys [handler port]} opts
        jetty-opts (-> opts (dissoc :handler :auth) (assoc :join? false))
        server     (jetty/run-jetty handler jetty-opts)]
    (println "Server started on port " port)
    server))

(defmethod ig/halt-key! :server/jetty
  [_ server]
  (.stop server))

(defmethod ig/init-key :handler/ring
  [_ opts]
  (handler/handler opts))

(defmethod ig/init-key :database/sql
  [_ opts]
  (let [datasource (jdbc/get-datasource opts)]
    (util/setup-db datasource)
    datasource))

The setup-db function creates the required tables in the database if they don’t exist yet. This works fine for database migrations in small projects like this demo app, but for larger projects, consider using libraries such as Migratus (my preferred library) or Ragtime.

src/acme/util.clj
(ns acme.util 
  (:require
   [next.jdbc :as jdbc]))

(defn setup-db
  [db]
  (jdbc/execute-one!
   db
   ["create table if not exists bookmarks (
       bookmark_id text primary key not null,
       url text not null,
       created_at datetime default (unixepoch()) not null
     )"]))

For the server handler, let’s start with a simple function that returns a “hi world” string.

src/acme/handler.clj
(ns acme.handler
  (:require
   [ring.util.response :as res]))

(defn handler
  [_opts]
  (fn [req]
    (res/response "hi world")))

Now all the components are implemented. We can check if the system is working properly by evaluating (reset) in the user namespace. This will reload your files and restart the system. You should see this message printed in your REPL:

:reloading (acme.util acme.handler acme.main)
Server started on port  8080
:resumed

If we send a request to http://localhost:8080/, we should get “hi world” as the response:

$ curl localhost:8080/
# hi world

Nice! The system is working correctly. In the next section, we’ll implement routing and our business logic handlers.

Routing, Middleware, and Route Handlers

First, let’s set up a ring handler and router using Reitit. We only have one route, the index / route that’ll handle both GET and POST requests.

src/acme/handler.clj
(ns acme.handler
  (:require
   [reitit.ring :as ring]))

(def routes
  [["/" {:get  index-page
         :post index-action}]])

(defn handler
  [opts]
  (ring/ring-handler
   (ring/router routes)
   (ring/routes
    (ring/redirect-trailing-slash-handler)
    (ring/create-resource-handler {:path "/"})
    (ring/create-default-handler))))

We’re including some useful middleware:

  • redirect-trailing-slash-handler to resolve routes with trailing slashes,
  • create-resource-handler to serve static files, and
  • create-default-handler to handle common 40x responses.

Implementing the Middlewares

If you remember the :handler/ring from earlier, you’ll notice that it has two dependencies, database and auth. Currently, they’re inaccessible to our route handlers. To fix this, we can inject these components into the Ring request map using a middleware function.

src/acme/handler.clj
;; ...

(defn components-middleware
  [components]
  (let [{:keys [database auth]} components]
    (fn [handler]
      (fn [req]
        (handler (assoc req
                        :db database
                        :auth auth))))))
;; ...

The components-middleware function takes in a map of components and creates a middleware function that “assocs” each component into the request map. [6] If you have more components, such as a Redis cache or a mail service, you can add them here.

We’ll also need a middleware to handle HTTP basic authentication. [7] This middleware checks whether the username and password from the request map match the values in the auth map injected by components-middleware. If they match, the request is authenticated and the user can view the site.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [acme.util :as util]
   [ring.util.response :as res]))
;; ...

(defn wrap-basic-auth
  [handler]
  (fn [req]
    (let [{:keys [headers auth]} req
          {:keys [username password]} auth
          authorization (get headers "authorization")
          correct-creds (str "Basic " (util/base64-encode
                                       (format "%s:%s" username password)))]
      (if (and authorization (= correct-creds authorization))
        (handler req)
        (-> (res/response "Access Denied")
            (res/status 401)
            (res/header "WWW-Authenticate" "Basic realm=protected"))))))
;; ...

A nice feature of Clojure is that interop with the host language is easy. The base64-encode function is just a thin wrapper over Java’s Base64.Encoder:

src/acme/util.clj
(ns acme.util
   ;; ...
  (:import java.util.Base64))

(defn base64-encode
  [s]
  (.encodeToString (Base64/getEncoder) (.getBytes s)))

Finally, we need to add them to the router. Since we’ll be handling form requests later, we’ll also bring in Ring’s wrap-params middleware.

src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [ring.middleware.params :refer [wrap-params]]))
;; ...

(defn handler
  [opts]
  (ring/ring-handler
   ;; ...
   {:middleware [(components-middleware opts)
                 wrap-basic-auth
                 wrap-params]}))
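
Once the route handlers from the next section are in place, you can sanity-check the auth flow from the command line. A quick sketch, using the example credentials we’ll set during deployment (adjust to whatever your config provides):

$ curl -i localhost:8080/
# HTTP/1.1 401 Unauthorized
# WWW-Authenticate: Basic realm=protected

$ curl -i -u hi@ryanmartin.me:not-so-secure-password localhost:8080/
# HTTP/1.1 200 OK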

Implementing the Route Handlers

We now have everything we need to implement the route handlers or the business logic of the app. First, we’ll implement the index-page function, which renders a page that:

  1. Shows all of the user’s bookmarks in the database, and
  2. Shows a form that allows the user to insert new bookmarks into the database
src/acme/handler.clj
(ns acme.handler
  (:require
   ;; ...
   [next.jdbc :as jdbc]
   [next.jdbc.sql :as sql]))
;; ...

(defn template
  [bookmarks]
  [:html
   [:head
    [:meta {:charset "utf-8"}]
    [:meta {:name    "viewport"
            :content "width=device-width, initial-scale=1.0"}]]
   [:body
    [:h1 "bookmarks"]
    [:form {:method "POST"}
     [:div
      [:label {:for "url"} "url "]
      [:input#url {:name "url"
                   :type "url"
                   :required true
                   :placeholder "https://en.wikipedia.org/"}]]
     [:button "submit"]]
    [:p "your bookmarks:"]
    [:ul
     (if (empty? bookmarks)
       [:li "you don't have any bookmarks"]
       (map
        (fn [{:keys [url]}]
          [:li
           [:a {:href url} url]])
        bookmarks))]]])

(defn index-page
  [req]
  (try
    (let [bookmarks (sql/query (:db req)
                               ["select * from bookmarks"]
                               jdbc/unqualified-snake-kebab-opts)]
      (util/render (template bookmarks)))
    (catch Exception e
      (util/server-error e))))
;; ...

Database queries can sometimes throw exceptions, so it’s good to wrap them in a try-catch block. I’ll also introduce some helper functions:

src/acme/util.clj
(ns acme.util
  (:require
   ;; ...
   [hiccup2.core :as h]
   [ring.util.response :as res])
  (:import java.util.Base64))
;; ...

(defn prepend-doctype
  [s]
  (str "<!doctype html>" s))

(defn render
  [hiccup]
  (-> hiccup h/html str prepend-doctype res/response (res/content-type "text/html")))

(defn server-error
  [e]
  (println "Caught exception: " e)
  (-> (res/response "Internal server error")
      (res/status 500)))

render takes a hiccup form and turns it into a ring response, while server-error takes an exception, logs it, and returns a 500 response.
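
For example, in a REPL where acme.util is aliased as util, render returns a plain Ring response map (shape sketched here):

user=> (util/render [:p "hi"])
;; => {:status 200
;;     :headers {"Content-Type" "text/html"}
;;     :body "<!doctype html><p>hi</p>"}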

Next, we’ll implement the index-action function:

src/acme/handler.clj
;; ...

(defn index-action
  [req]
  (try
    (let [{:keys [db form-params]} req
          value (get form-params "url")]
      (sql/insert! db :bookmarks {:bookmark_id (random-uuid) :url value})
      (res/redirect "/" 303))
    (catch Exception e
      (util/server-error e))))
;; ...

This is an implementation of a typical post/redirect/get pattern. We get the value from the URL form field, insert a new row in the database with that value, and redirect back to the index page. Again, we’re using a try-catch block to handle possible exceptions from the database query.

That should be all of the code for the controllers. If you reload your REPL and go to http://localhost:8080, you should see something that looks like this after logging in:

Screenshot of the app

The last thing we need to do is to update the main function to start the system:

src/acme/main.clj
;; ...

(defn -main [& _]
  (-> (read-config) ig/expand ig/init))

Now, you should be able to run the app using clj -M -m acme.main, as shown below. That’s all the code needed for the app. In the next section, we’ll package the app into a Docker image to deploy to Fly.
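
If everything is wired up correctly, you’ll see the same startup message as in the REPL:

$ clj -M -m acme.main
# Server started on port  8080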

Packaging the App

While there are many ways to package a Clojure app, Fly.io specifically requires a Docker image. There are two approaches to doing this:

  1. Build an uberjar and run it using Java in the container, or
  2. Load the source code and run it using Clojure in the container

Both are valid approaches. I prefer the first since its only dependency is the JVM. We’ll use the tools.build library to build the uberjar. Check out the official guide for more information on building Clojure programs. Since it’s a library, to use it, we can add it to our deps.edn file with an alias:

deps.edn
{;; ...
 :aliases
 {;; ...
  :build {:extra-deps {io.github.clojure/tools.build 
                       {:git/tag "v0.10.5" :git/sha "2a21b7a"}}
          :ns-default build}}}

Tools.build expects a build.clj file in the root of the project directory, so we’ll need to create that file. This file contains the instructions to build artefacts, which in our case is a single uberjar. There are many great examples of build.clj files on the web, including from the official documentation. For now, you can copy+paste this file into your project.

build.clj
(ns build
  (:require
   [clojure.tools.build.api :as b]))

(def basis (delay (b/create-basis {:project "deps.edn"})))
(def src-dirs ["src" "resources"])
(def class-dir "target/classes")

(defn uber
  [_]
  (println "Cleaning build directory...")
  (b/delete {:path "target"})

  (println "Copying files...")
  (b/copy-dir {:src-dirs   src-dirs
               :target-dir class-dir})

  (println "Compiling Clojure...")
  (b/compile-clj {:basis      @basis
                  :ns-compile '[acme.main]
                  :class-dir  class-dir})

  (println "Building Uberjar...")
  (b/uber {:basis     @basis
           :class-dir class-dir
           :uber-file "target/standalone.jar"
           :main      'acme.main}))

To build the project, run clj -T:build uber. This will create the uberjar standalone.jar in the target directory. The uber in clj -T:build uber refers to the uber function from build.clj. Since the build system is a Clojure program, you can customise it however you like. If we try to run the uberjar now, we’ll get an error:

# build the uberjar
$ clj -T:build uber
# Cleaning build directory...
# Copying files...
# Compiling Clojure...
# Building Uberjar...

# run the uberjar
$ java -jar target/standalone.jar
# Error: Could not find or load main class acme.main
# Caused by: java.lang.ClassNotFoundException: acme.main

This error occurred because the main class required by Java wasn’t generated. To fix this, we need to add the :gen-class directive to our main namespace. This instructs Clojure to generate a Java class whose main method calls our -main function.

src/acme/main.clj
(ns acme.main
  ;; ...
  (:gen-class))
;; ...

If you rebuild the project and run java -jar target/standalone.jar again, it should work perfectly. Now that we have a working build script, we can write the Dockerfile:

Dockerfile
# install additional dependencies here in the base layer
# separate base from build layer so any additional deps installed are cached
FROM clojure:temurin-21-tools-deps-bookworm-slim AS base

FROM base AS build
WORKDIR /opt
COPY . .
RUN clj -T:build uber

FROM eclipse-temurin:21-alpine AS prod
COPY --from=build /opt/target/standalone.jar /
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "standalone.jar"]

It’s a multi-stage Dockerfile. We use the official Clojure Docker image as the layer to build the uberjar. Once it’s built, we copy it to a smaller Docker image that contains only the Java runtime. [8] By doing this, we get a smaller container image as well as faster Docker builds, because the layers are better cached.
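
If you want to smoke-test the image locally before deploying, something like this should work (the environment variables here mirror the ones we’ll configure on Fly below; adjust as needed for your config):

$ docker build -t acme .
$ docker run --rm -p 8080:8080 \
    -e DB_DATABASE=/tmp/database.db \
    -e AUTH_USER=user -e AUTH_PASSWORD=password \
    acme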

That should be all for packaging the app. We can move on to the deployment now.

Deploying with Fly.io

First things first, you’ll need to install flyctl, Fly’s CLI tool for interacting with their platform. Create a Fly.io account if you haven’t already. Then run fly auth login to authenticate flyctl with your account.

Next, we’ll need to create a new Fly App:

$ fly app create
# ? Choose an app name (leave blank to generate one): 
# automatically selected personal organization: Ryan Martin
# New app created: blue-water-6489

Another way to do this is with the fly launch command, which automates a lot of the app configuration for you. However, there are some steps fly launch doesn’t cover, so we’ll configure the app manually. I also have a fly.toml file ready that you can copy straight into your project.

fly.toml
# replace these with your app and region name
# run `fly platform regions` to get a list of regions
app = 'blue-water-6489' 
primary_region = 'sin'

[env]
  DB_DATABASE = "/data/database.db"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[mounts]
  source = "data"
  destination = "/data"
  initial_size = 1

[[vm]]
  size = "shared-cpu-1x"
  memory = "512mb"
  cpus = 1
  cpu_kind = "shared"

These are mostly the default configuration values with some additions. Under the [env] section, we’re setting the SQLite database location to /data/database.db. The database.db file itself will be stored in a persistent Fly Volume mounted on the /data directory. This is specified under the [mounts] section. Fly Volumes are similar to regular Docker volumes but are designed for Fly’s micro VMs.

We’ll need to set the AUTH_USER and AUTH_PASSWORD environment variables too, but not through the fly.toml file as these are sensitive values. To securely set these credentials with Fly, we can set them as app secrets. They’re stored encrypted and will be automatically injected into the app at boot time.

$ fly secrets set AUTH_USER=hi@ryanmartin.me AUTH_PASSWORD=not-so-secure-password
# Secrets are staged for the first deployment

With this, the configuration is done and we can deploy the app using fly deploy:

$ fly deploy
# ...
# Checking DNS configuration for blue-water-6489.fly.dev
# Visit your newly deployed app at https://blue-water-6489.fly.dev/

The first deployment will take longer since it’s building the Docker image for the first time. Subsequent deployments should be faster due to the cached image layers. You can click on the link to view the deployed app, or you can also run fly open, which will do the same thing. Here’s the app in action:

The app in action

If you made additional changes to the app or fly.toml, you can redeploy the app using the same command, fly deploy. The app is configured to auto stop/start, which helps to cut costs when there’s not a lot of traffic to the site. If you want to take down the deployment, you’ll need to delete the app itself using fly app destroy <your app name>.

Adding a Production REPL

This is an interesting topic in the Clojure community, with varying opinions on whether or not it’s a good idea. Personally, I find having a REPL connected to the live app helpful, and I often use it for debugging and running queries on the live database. [9] Since we’re using SQLite, we don’t have a database server we can connect to directly, unlike Postgres or MySQL.

If you’re brave, you can even restart the app directly from the REPL without redeploying. It’s also easy to go wrong with this, which is why some prefer not to use it.

For this project, we’re gonna add a socket REPL. It’s very simple to add (you just need to add a JVM option) and it doesn’t require additional dependencies like nREPL. Let’s update the Dockerfile:

Dockerfile
# ...
EXPOSE 7888
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888 :accept clojure.core.server/repl}", "-jar", "standalone.jar"]

The socket REPL will be listening on port 7888. If we redeploy the app now, the REPL will be started, but we won’t be able to connect to it. That’s because we haven’t exposed the service through Fly proxy. We can do this by adding the socket REPL as a service in the [services] section in fly.toml.

However, doing this will also expose the REPL port to the public. This means that anyone can connect to your REPL and possibly mess with your app. Instead, what we want to do is to configure the socket REPL as a private service.

By default, all Fly apps in your organisation live in the same private network. This private network, called 6PN, connects the apps in your organisation through WireGuard tunnels (a VPN) using IPv6. Fly private services aren’t exposed to the public internet but can be reached from this private network. We can then use WireGuard to connect to this private network to reach our socket REPL.

Fly VMs are also configured with the hostname fly-local-6pn, which maps to its 6PN address. This is analogous to localhost, which points to your loopback address 127.0.0.1. To expose a service to 6PN, all we have to do is bind or serve it to fly-local-6pn instead of the usual 0.0.0.0. We have to update the socket REPL options to:

Dockerfile
# ...
ENTRYPOINT ["java", "-Dclojure.server.repl={:port 7888,:address \"fly-local-6pn\",:accept clojure.core.server/repl}", "-jar", "standalone.jar"]

After redeploying, we can use the fly proxy command to forward the port from the remote server to our local machine. [10]

$ fly proxy 7888:7888
# Proxying local port 7888 to remote [blue-water-6489.internal]:7888

In another shell, run:

$ rlwrap nc localhost 7888
# user=>

Now we have a REPL connected to the production app! rlwrap is used for readline functionality, e.g. up/down arrow keys, vi bindings. Of course, you can also connect to it from your editor.
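
From here, you can evaluate forms directly against the live process, for example:

user=> (+ 1 2)
3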

Deploy with GitHub Actions

If you’re using GitHub, we can also set up automatic deployments on every push to main with GitHub Actions. All you need is to create the workflow file:

.github/workflows/fly.yaml
name: Fly Deploy
on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  deploy:
    name: Deploy app
    runs-on: ubuntu-latest
    concurrency: deploy-group
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}

To get this to work, you’ll need to create a deploy token from your app’s dashboard. Then, in your GitHub repo, create a new repository secret called FLY_API_TOKEN with the value of your deploy token. Now, whenever you push to the main branch, this workflow will automatically run and deploy your app. You can also manually run the workflow from GitHub because of the workflow_dispatch option.
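
If you prefer the CLI over the dashboard, flyctl can also mint a deploy token:

$ fly tokens create deploy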

End

As always, all the code is available on GitHub. Originally, this post was just about deploying to Fly.io, but along the way, I kept adding on more stuff until it essentially became my version of the user manager example app. Anyway, hope this post provided a good view into web development with Clojure. As a bonus, here are some additional resources on deploying Clojure apps:


  1. The way Fly.io works under the hood is pretty clever. Instead of running the container image with a runtime like Docker, the image is unpacked and “loaded” into a VM. See this video explanation for more details. ↩︎

  2. If you’re interested in learning Clojure, my recommendation is to follow the official getting started guide and join the Clojurians Slack. Also, read through this list of introductory resources. ↩︎

  3. Kit was a big influence on me when I first started learning web development in Clojure. I never used it directly, but I did use their library choices and project structure as a base for my own projects. ↩︎

  4. There’s no “Rails” for the Clojure ecosystem (yet?). The prevailing opinion is to build your own “framework” by composing different libraries together. Most of these libraries are stable and are already used in production by big companies, so don’t let this discourage you from doing web development in Clojure! ↩︎

  5. There might be some keys that you add or remove, but the structure of the config file stays the same. ↩︎

  6. “assoc” (associate) is a Clojure slang that means to add or update a key-value pair in a map. ↩︎

  7. For more details on how basic authentication works, check out the specification. ↩︎

  8. Here’s a cool resource I found when researching Java Dockerfiles: WhichJDK. It provides a comprehensive comparison of the different JDKs available and recommendations on which one you should use. ↩︎

  9. Another (non-technically important) argument for live/production REPLs is just because it’s cool. Ever since I read the story about NASA’s programmers debugging a spacecraft through a live REPL, I’ve always wanted to try it at least once. ↩︎

  10. If you encounter errors related to WireGuard when running fly proxy, you can run fly doctor, which will hopefully detect issues with your local setup and also suggest fixes for them. ↩︎


Advent of Code 2024 in Zig

This post is about seven months late, but here are my takeaways from Advent of Code 2024. It was my second time participating, and this time I actually managed to complete it. [1] My goal was to learn a new language, Zig, and to improve my DSA and problem-solving skills.

If you’re not familiar, Advent of Code is an annual programming challenge that runs every December. A new puzzle is released each day from December 1st to the 25th. There’s also a global leaderboard where people (and AI) race to get the fastest solves, but I personally don’t compete in it, mostly because I want to do it at my own pace.

I went with Zig because I had been curious about it for a while, mainly because of its promise of being a better C and because TigerBeetle (one of the coolest databases around right now) is written in it. Learning Zig felt like a good way to get back into systems programming, something I’ve been wanting to do after a couple of chaotic years of web development.

This post is mostly about my setup, results, and the things I learned from solving the puzzles. If you’re more interested in my solutions, I’ve also uploaded my code and solution write-ups to my GitHub repository.

My Advent of Code results page

Project Setup

There were several Advent of Code templates in Zig that I looked at as a reference for my development setup, but none of them really clicked with me. I ended up just running my solutions directly using zig run for the whole event. It wasn’t until after the event ended that I properly learned Zig’s build system and reorganised my project.

Here’s what the project structure looks like now:

.
├── src
│   ├── days
│   │   ├── data
│   │   │   ├── day01.txt
│   │   │   ├── day02.txt
│   │   │   └── ...
│   │   ├── day01.zig
│   │   ├── day02.zig
│   │   └── ...
│   ├── bench.zig
│   └── run.zig
└── build.zig

The project is powered by build.zig, which defines several commands:

  1. Build
    • zig build - Builds all of the binaries for all optimisation modes.
  2. Run
    • zig build run - Runs all solutions sequentially.
    • zig build run -Day=XX - Runs the solution of the specified day only.
  3. Benchmark
    • zig build bench - Runs all benchmarks sequentially.
    • zig build bench -Day=XX - Runs the benchmark of the specified day only.
  4. Test
    • zig build test - Runs all tests sequentially.
    • zig build test -Day=XX - Runs the tests of the specified day only.

You can also pass the optimisation mode that you want to any of the commands above with the -Doptimize flag.
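
For example, to run day 6’s benchmark with full optimisations:

$ zig build bench -Day=06 -Doptimize=ReleaseFast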

Under the hood, build.zig compiles src/run.zig when you call zig build run, and src/bench.zig when you call zig build bench. These files are templates that import the solution for a specific day from src/days/dayXX.zig. For example, here’s what src/run.zig looks like:

src/run.zig
const std = @import("std");
const puzzle = @import("day"); // Injected by build.zig

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    std.debug.print("{s}\n", .{puzzle.title});
    _ = try puzzle.run(allocator, true);
    std.debug.print("\n", .{});
}

The day module imported is an anonymous import dynamically injected by build.zig during compilation. This allows a single run.zig or bench.zig to be reused for all solutions. This avoids repeating boilerplate code in the solution files. Here’s a simplified version of my build.zig file that shows how this works:

build.zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const run_all = b.step("run", "Run all days");
    const day_option = b.option(usize, "ay", ""); // "ay" plus zig build's -D prefix reads as `-Day`

    // Generate build targets for all 25 days.
    for (1..26) |day| {
        const day_zig_file = b.path(b.fmt("src/days/day{d:0>2}.zig", .{day}));

        // Create an executable for running this specific day.
        const run_exe = b.addExecutable(.{
            .name = b.fmt("run-day{d:0>2}", .{day}),
            .root_source_file = b.path("src/run.zig"),
            .target = target,
            .optimize = optimize,
        });

        // Inject the day-specific solution file as the anonymous module `day`.
        run_exe.root_module.addAnonymousImport("day", .{ .root_source_file = day_zig_file });

        // Install the executable so it can be run.
        b.installArtifact(run_exe);

        // ...
    }
}

My actual build.zig has some extra code that builds the binaries for all optimisation modes.

This setup is pretty barebones. I’ve seen other templates do cool things like scaffold files, download puzzle inputs, and even submit answers automatically. Since I wrote my build.zig after the event ended, I didn’t get to use it while solving the puzzles. I might add these features if I decide to do Advent of Code again this year with Zig.

Self-Imposed Constraints

While there are no rules to Advent of Code itself, to make things a little more interesting, I set a few constraints and rules for myself:

  1. The code must be readable. By “readable”, I mean the code should be straightforward and easy to follow. No unnecessary abstractions. I should be able to come back to the code months later and still understand (most of) it.
  2. Solutions must be a single file. No external dependencies. No shared utilities module. Everything needed to solve the puzzle should be visible in that one solution file.
  3. The total runtime must be under one second. [2] All solutions, when run sequentially, should finish in under one second. I want to improve my performance engineering skills.
  4. Parts should be solved separately. This means: (1) no solving both parts simultaneously, and (2) no doing extra work in part one that makes part two faster. The aim of this is to get a clear idea of how long each part takes on its own.
  5. No concurrency or parallelism. Solutions must run sequentially on a single thread. This keeps the focus on the efficiency of the algorithm. I can’t speed up slow solutions by using multiple CPU cores.
  6. No ChatGPT. No Claude. No AI help. I want to train myself, not the LLM. I can look at other people’s solutions, but only after I have given my best effort at solving the problem.
  7. Follow the constraints of the input file. The solution doesn’t have to work for all possible scenarios, but it should work for all valid inputs. If the input file only contains 8-bit unsigned integers, the solution doesn’t have to handle larger integer types.
  8. Hardcoding is allowed. For example: size of the input, number of rows and columns, etc. Since the input is known at compile-time, we can skip runtime parsing and just embed it into the program using Zig’s @embedFile.

Most of these constraints are designed to push me to write clearer, more performant code. I also wanted my code to look like it was taken straight from TigerBeetle’s codebase (minus the assertions). [3] Lastly, I just thought it would make the experience more fun.

Favourite Puzzles

From all of the puzzles, here are my top 3 favourites:

  1. Day 6: Guard Gallivant - This is my slowest day (in benchmarks), but also the one I learned the most from. Some of these learnings include: using vectors to represent directions, padding 2D grids, metadata packing, system endianness, etc.
  2. Day 17: Chronospatial Computer - I love reverse engineering puzzles. I used to do a lot of these in CTFs during my university days. The best thing I learned from this day is the realisation that we can use different integer bases to optimise data representation. This helped improve my runtimes in the later days 22 and 23.
  3. Day 21: Keypad Conundrum - This one was fun. My gut told me it could be solved greedily by always choosing the best move. It was right. Though I did have to scroll Reddit for a bit to figure out the step I was missing: you have to visit the farthest keypads first. This is also my longest solution file (almost 400 lines) because I hardcoded the best-moves table.

Honourable mention:

  1. Day 24: Crossed Wires - Another reverse engineering puzzle. Confession: I didn’t solve this myself during the event. After 23 brutal days, my brain was too tired, so I copied a random Python solution from Reddit. When I retried it later, it turned out to be pretty fun. I still couldn’t find a solution I was satisfied with though.

Programming Patterns and Zig Tricks

During the event, I learned a lot about Zig and performance, and also developed some personal coding conventions. Some of these are Zig-specific, but most are universal and can be applied across languages. This section covers general programming and Zig patterns I found useful. The next section will focus on performance-related tips.

Comptime

Zig’s flagship feature, comptime, is surprisingly useful. I knew Zig uses it for generics and that people do clever metaprogramming with it, but I didn’t expect to be using it so often myself.

My main use for comptime was to generate puzzle-specific types. All my solution files follow the same structure, with a DayXX function that takes some parameters (usually the input length) and returns a puzzle-specific type, e.g.:

src/days/day01.zig
fn Day01(comptime length: usize) type {
    return struct {
        const Self = @This();
        
        left: [length]u32 = undefined,
        right: [length]u32 = undefined,

        fn init(input: []const u8) !Self {}

        // ...
    };
}

This lets me instantiate the type with a size that matches my input:

src/days/day01.zig
// Here, `Day01` is called with the size of my actual input.
pub fn run(_: std.mem.Allocator, is_run: bool) ![3]u64 {
    // ...
    const input = @embedFile("./data/day01.txt");
    var puzzle = try Day01(1000).init(input);
    // ...
}

// Here, `Day01` is called with the size of my test input.
test "day 01 part 1 sample 1" {
    var puzzle = try Day01(6).init(sample_input);
    // ...
}

This allows me to reuse logic across different inputs while still hardcoding the array sizes. Without comptime, I’d have to either create a separate function for each of my different inputs or dynamically allocate memory because I couldn’t hardcode the array size.

I also used comptime to shift some computation to compile-time to reduce runtime overhead. For example, on day 4, I needed a function to check whether a string matches either "XMAS" or its reverse, "SAMX". A pretty simple function that you can write as a one-liner in Python:

example.py
def matches(pattern, target):
    return target == pattern or target == pattern[::-1]

Typically, a function like this requires some dynamic allocation to create the reversed string, since the length of the string is only known at runtime. [4] For this puzzle, since the words to reverse are known at compile-time, we can do something like this:

src/days/day04.zig
fn matches(comptime word: []const u8, slice: []const u8) bool {
    var reversed: [word.len]u8 = undefined;
    @memcpy(&reversed, word);
    std.mem.reverse(u8, &reversed);
    return std.mem.eql(u8, word, slice) or std.mem.eql(u8, &reversed, slice);
}

This creates a separate function for each word I want to reverse. [5] Each function has an array with the same size as the word to reverse. This removes the need for dynamic allocation and makes the code run faster. As a bonus, Zig also warns you when the word isn’t compile-time known, so you get an immediate error if you pass in a runtime value.

Optional Types

A common pattern in C is to return special sentinel values to denote missing values or errors, e.g. -1, 0, or NULL. In fact, I did this on day 13 of the challenge:

src/days/day13.zig
// We won't ever get 0 as a result, so we use it as a sentinel error value.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) 0 else numerator / denominator;
}

// Then in the caller, skip if the return value is 0.
if (count_tokens(a, b, p) == 0) continue;

This works, but it’s easy to forget to check for those values, or worse, to accidentally treat them as valid results. Zig improves on this with optional types. If a function might not return a value, you can return ?T instead of T. This also forces the caller to handle the null case. Unlike C, null isn’t a pointer but a more general concept. Zig treats null as the absence of a value for any type, just like Rust’s Option<T>.

The count_tokens function can be refactored to:

src/days/day13.zig
// Return null instead if there's no valid result.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) ?u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) null else numerator / denominator;
}

// The caller is now forced to handle the null case.
if (count_tokens(a, b, p)) |n_tokens| {
    // logic only runs when n_tokens is not null.
}

Zig also has a concept of error unions, where a function can return either a value or an error (in Rust, this is Result<T, E>). You could use an error union instead of an optional for count_tokens; Zig doesn’t force a single approach. I come from Clojure, where returning nil for an error or missing value is common, so optionals felt natural to me.
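
For illustration, here’s what an error-union version might look like. This is just a sketch; error.NoSolution is a made-up name:

example.zig
// Sketch: count_tokens with an error union instead of an optional.
fn count_tokens(a: [2]u8, b: [2]u8, p: [2]i64) error{NoSolution}!u64 {
    const numerator = @abs(p[0] * b[1] - p[1] * b[0]);
    const denominator = @abs(@as(i32, a[0]) * b[1] - @as(i32, a[1]) * b[0]);
    return if (numerator % denominator != 0) error.NoSolution else numerator / denominator;
}

// The caller can propagate the error with `try`, or handle it locally:
const n_tokens = count_tokens(a, b, p) catch continue;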

Grid Padding

This year has a lot of 2D grid puzzles (arguably too many). A common feature of grid-based algorithms is the out-of-bounds check. Here’s what it usually looks like:

example.zig
fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;
    
    // Bounds check here.
    if (x < 0 or y < 0 or x >= map.len or y >= map[0].len) return 0;

    if (map[x][y] == .visited) return 0;
    map[x][y] = .visited;

    var result: u32 = 1;
    for (directions) |direction| {
        result += dfs(map, position + direction);
    }
    return result;
}

This is a typical recursive DFS function. After doing a lot of this, I discovered a nice trick that not only improves code readability, but also its performance. The trick here is to pad the grid with sentinel characters that mark out-of-bounds areas, i.e. add a border to the grid.

Here’s an example from day 6:

Original map:               With borders added:
                            ************
....#.....                  *....#.....*
.........#                  *.........#*
..........                  *..........*
..#.......                  *..#.......*
.......#..        ->        *.......#..*
..........                  *..........*
.#..^.....                  *.#..^.....*
........#.                  *........#.*
#.........                  *#.........*
......#...                  *......#...*
                            ************

You can use any value for the border, as long as it doesn’t conflict with valid values in the grid. With the border in place, the bounds check becomes a simple equality comparison:

example.zig
const border = '*';

fn dfs(map: [][]u8, position: [2]i8) u32 {
    const x, const y = position;
    if (map[x][y] == border) { // We are out of bounds
        return 0;
    }
    // ...
}

This is much more readable than the previous code. Plus, it’s also faster since we’re only doing one equality check instead of four range checks.

That said, this isn’t a one-size-fits-all solution. It only works for algorithms that traverse the grid one step at a time. If your logic jumps multiple tiles, it can still go out of bounds (unless you widen the border to account for this). This approach also uses a bit more memory than the regular approach, since you have to store the extra border characters.

SIMD Vectors

This could also go in the performance section, but I’m including it here because the biggest benefit I get from using SIMD in Zig is the improved code readability. Because Zig has first-class support for vector types, you can write elegant and readable code that also happens to be faster.

If you’re not familiar with vectors, they are a special collection type used for single instruction, multiple data (SIMD) operations. SIMD allows you to perform computation on multiple values in parallel using a single CPU instruction, which often leads to performance boosts. [6]

I mostly use vectors to represent positions and directions, e.g. for traversing a grid. Instead of writing code like this:

example.zig
next_position = .{ position[0] + direction[0], position[1] + direction[1] };

You can represent position and direction as 2-element vectors and write code like this:

example.zig
next_position = position + direction;

This is much nicer than the previous version!
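
Declaring them as vectors instead of arrays is all it takes, e.g.:

example.zig
const Vec2 = @Vector(2, i8);

const position: Vec2 = .{ 3, 4 };
const direction: Vec2 = .{ 0, 1 };
const next_position = position + direction; // .{ 3, 5 }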

Day 25 is another good example of a problem that can be solved elegantly using vectors:

src/days/day25.zig
var result: u64 = 0;
for (self.locks.items) |lock| { // lock is a vector
    for (self.keys.items) |key| { // key is also a vector
        const fitted = lock + key > @as(@Vector(5, u8), @splat(5));
        const is_overlap = @reduce(.Or, fitted);
        result += @intFromBool(!is_overlap);
    }
}

Expressing the logic as vector operations makes the code cleaner since you don’t have to write loops and conditionals as you typically would in a traditional approach.

Performance Tips

The tips below are general performance techniques that often help, but like most things in software engineering, “it depends”. These might work 80% of the time, but performance is often highly context-specific. You should benchmark your code instead of blindly following what other people say.

This section would’ve been more fun with concrete examples, step-by-step optimisations, and benchmarks, but that would’ve made the post way too long. Hopefully, I’ll get to write something like that in the future. [7]

Minimise Allocations

Whenever possible, prefer static allocation. It’s cheaper since it just involves moving the stack pointer, whereas dynamic allocation carries the overhead of the allocator machinery. That said, it’s not always the right choice, since it comes with limitations: stack size is limited, the memory size must be compile-time known, its lifetime is tied to the current stack frame, etc.

If you need to do dynamic allocations, try to reduce the number of times you call the allocator. The number of allocations you do matters more than the amount of memory you allocate. More allocations mean more bookkeeping, synchronisation, and sometimes syscalls.

A simple but effective way to reduce allocations is to reuse buffers, whether they’re statically or dynamically allocated. Here’s an example from day 10. For each trail head, we want to create a set of trail ends reachable from it. The naive approach is to allocate a new set every iteration:

src/days/day10.zig
for (self.trail_heads.items) |trail_head| {
    var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
    defer trail_ends.deinit();
    
    // Set building logic...
}

What you can do instead is to allocate the set once before the loop. Then, each iteration, you reuse the set by emptying it without freeing the memory. For Zig’s std.AutoHashMap, this can be done using the clearRetainingCapacity method:

src/days/day10.zig
var trail_ends = std.AutoHashMap([2]u8, void).init(self.allocator);
defer trail_ends.deinit();

for (self.trail_heads.items) |trail_head| {
    trail_ends.clearRetainingCapacity();
    
    // Set building logic...
}

If you use static arrays, you can also just overwrite existing data instead of clearing it.

A step up from this is to reuse multiple buffers. The simplest form of this is to reuse two buffers, i.e. double buffering. Here’s an example from day 11:

src/days/day11.zig
// Initialise two hash maps that we'll alternate between.
var frequencies: [2]std.AutoHashMap(u64, u64) = undefined;
for (0..2) |i| frequencies[i] = std.AutoHashMap(u64, u64).init(self.allocator);
defer for (0..2) |i| frequencies[i].deinit();

var id: usize = 0;
for (self.stones) |stone| try frequencies[id].put(stone, 1);

for (0..n_blinks) |_| {
    var old_frequencies = &frequencies[id % 2];
    var new_frequencies = &frequencies[(id + 1) % 2];
    id += 1;

    defer old_frequencies.clearRetainingCapacity();

    // Do stuff with both maps...
}

Here we have two maps to count the frequencies of stones across iterations. Each iteration will build up new_frequencies with the values from old_frequencies. Doing this reduces the number of allocations to just 2 (the number of buffers). The tradeoff here is that it makes the code slightly more complex.

Make Your Data Smaller

A common performance tip is to have “mechanical sympathy”: understand how your code is processed by your computer. An example of this is structuring your data so it works better with your CPU, e.g. keeping related data close in memory to take advantage of cache locality.

Reducing the size of your data helps with this. Smaller data means more of it can fit in cache. One way to shrink your data is through bit packing. This depends heavily on your specific data, so you’ll need to use your judgement to tell whether this would work for you. I’ll just share some examples that worked for me.

The first example is in day 6 part two, where you have to detect a loop, which happens when you revisit a tile from the same direction as before. To track this, you could use a map or a set to store the tiles and visited directions. A more efficient option is to store this direction metadata in the tile itself.

There are only four tile types, which means you only need two bits to represent the tile types as an enum. If the enum size is one byte, here’s what the tiles look like in memory:

.obstacle -> 00000000
.path     -> 00000001
.visited  -> 00000010
.exit     -> 00000011

As you can see, the upper six bits are unused. We can store the direction metadata in the upper four bits. One bit for each direction. If a bit is set, it means that we’ve already visited the tile in this direction. Here’s an illustration of the memory layout:

        direction metadata   tile type
           ┌─────┴─────┐   ┌─────┴─────┐
┌────────┬─┴─┬───┬───┬─┴─┬─┴─┬───┬───┬─┴─┐
│ Tile:  │ 1 │ 0 │ 0 │ 0 │ 0 │ 0 │ 1 │ 0 │
└────────┴─┬─┴─┬─┴─┬─┴─┬─┴───┴───┴───┴───┘
   up bit ─┘   │   │   └─ left bit
    right bit ─┘ down bit

If your language supports struct packing, you can express this layout directly: [8]

src/days/day06.zig
const Tile = packed struct(u8) {
    const TileType = enum(u4) { obstacle, path, visited, exit };

    up: u1 = 0,
    right: u1 = 0,
    down: u1 = 0,
    left: u1 = 0,
    tile: TileType,

    // ...
};

Doing this avoids extra allocations and improves cache locality. Since the direction metadata is colocated with the tile type, both fit together in cache. Accessing the directions just requires a few bitwise operations instead of fetching them from another region of memory.

Another way to shrink your data is to represent it using alternate number bases. Here’s an example from day 23. Computers are identified by two-character strings of lowercase letters, e.g. "bc", "xy", etc. Instead of storing these as [2]u8 arrays, you can convert each name into a base-26 number and store it as a u16. [9]

Here’s the idea: map 'a' to 0, 'b' to 1, and so on up to 'z' as 25. Each character in the string becomes a digit of a base-26 number. For example, "bc" ([2]u8{ 'b', 'c' }) becomes the base-10 number 28 (1×26+2=28); written in base 26, its digits are 1 and 2 ('b' = 1, 'c' = 2).
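
In code, the conversion is a tiny helper. A sketch (my actual solution may look slightly different):

example.zig
// Pack a two-letter computer name into a base-26 number.
fn encode(name: [2]u8) u16 {
    const hi: u16 = name[0] - 'a'; // 'b' -> 1
    const lo: u16 = name[1] - 'a'; // 'c' -> 2
    return hi * 26 + lo; // "bc" -> 1 * 26 + 2 = 28
}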

While they take the same amount of space (2 bytes), a u16 has some benefits over a [2]u8:

  1. It fits in a single register, whereas you need two for the array.
  2. Comparison is faster as there is only a single value to compare.

Reduce Branching

I won’t explain branchless programming here; Algorithmica explains it way better than I can. While modern compilers are often smart enough to compile away branches, they don’t catch everything. I still recommend writing branchless code whenever it makes sense. It also has the added benefit of reducing the number of codepaths in your program.

Again, since performance is very context-dependent, I’ll just show you some patterns I use. Here’s one that comes up often:

src/days/day02.zig
if (is_valid_report(report)) {
    result += 1;
}

Instead of the branch, cast the bool into an integer directly:

src/days/day02.zig
result += @intFromBool(is_valid_report(report));

Another example is from day 6 (again!). Recall that to know if a tile has been visited from a certain direction, we have to check its direction bit. Here’s one way to do it:

src/days/day06.zig
fn has_visited(tile: Tile, direction: Direction) bool {
    switch (direction) {
        .up => return tile.up == 1,
        .right => return tile.right == 1,
        .down => return tile.down == 1,
        .left => return tile.left == 1,
    }
}

This works, but it introduces a few branches. We can make it branchless using bitwise operations:

src/days/day06.zig
fn has_visited(tile: Tile, direction: Direction) bool {
    const int_tile = std.mem.nativeToBig(u8, @bitCast(tile));
    const mask = direction.mask();
    const bits = int_tile & 0x0f; // Keep only the four direction bits (the low nibble)
    return bits & mask == mask;
}

While this is arguably cryptic and less readable, it does perform better than the switch version.

Avoid Recursion

The final performance tip is to prefer iterative code over recursion. Recursive functions bring the overhead of allocating stack frames. While recursive code is more elegant, it’s also often slower unless your language’s compiler can optimise it away, e.g. via tail-call optimisation. As far as I know, Zig doesn’t have this, though I might be wrong.

Recursion also has the risk of causing a stack overflow if the execution isn’t bounded. This is why code that is mission- or safety-critical avoids recursion entirely. It’s in TigerBeetle’s TIGERSTYLE and also NASA’s Power of Ten.

Iterative code can be harder to write in some cases, e.g. DFS maps naturally to recursion, but most of the time it is significantly faster, more predictable, and safer than the recursive alternative.

Benchmarks

I ran benchmarks for all 25 solutions in each of Zig’s optimisation modes. You can find the full results and the benchmark script in my GitHub repository. All benchmarks were done on an Apple M3 Pro.

As expected, ReleaseFast produced the best result with a total runtime of 85.1 ms. I’m quite happy with this, considering the two constraints that limited the optimisations I could apply to the code:

  • Parts should be solved separately - Some days can be solved in a single go, e.g. day 10 and day 13, which could’ve saved a few milliseconds.
  • No concurrency or parallelism - My slowest days are the compute-heavy days that are very easily parallelisable, e.g. day 6, day 19, and day 22. Without this constraint, I can probably reach sub-20 milliseconds total(?), but that’s for another time.

You can see the full benchmarks for ReleaseFast in the table below:

Day | Title | Parsing (µs) | Part 1 (µs) | Part 2 (µs) | Total (µs)
1 | Historian Hysteria | 23.5 | 15.5 | 2.8 | 41.8
2 | Red-Nosed Reports | 42.9 | 0.0 | 11.5 | 54.4
3 | Mull it Over | 0.0 | 7.2 | 16.0 | 23.2
4 | Ceres Search | 5.9 | 0.0 | 0.0 | 5.9
5 | Print Queue | 22.3 | 0.0 | 4.6 | 26.9
6 | Guard Gallivant | 14.0 | 25.2 | 24,331.5 | 24,370.7
7 | Bridge Repair | 72.6 | 321.4 | 9,620.7 | 10,014.7
8 | Resonant Collinearity | 2.7 | 3.3 | 13.4 | 19.4
9 | Disk Fragmenter | 0.8 | 12.9 | 137.9 | 151.7
10 | Hoof It | 2.2 | 29.9 | 27.8 | 59.9
11 | Plutonian Pebbles | 0.1 | 43.8 | 2,115.2 | 2,159.1
12 | Garden Groups | 6.8 | 164.4 | 249.0 | 420.3
13 | Claw Contraption | 14.7 | 0.0 | 0.0 | 14.7
14 | Restroom Redoubt | 13.7 | 0.0 | 0.0 | 13.7
15 | Warehouse Woes | 14.6 | 228.5 | 458.3 | 701.5
16 | Reindeer Maze | 12.6 | 2,480.8 | 9,010.7 | 11,504.1
17 | Chronospatial Computer | 0.1 | 0.2 | 44.5 | 44.8
18 | RAM Run | 35.6 | 15.8 | 33.8 | 85.2
19 | Linen Layout | 10.7 | 11,890.8 | 11,908.7 | 23,810.2
20 | Race Condition | 48.7 | 54.5 | 54.2 | 157.4
21 | Keypad Conundrum | 0.0 | 1.7 | 22.4 | 24.2
22 | Monkey Market | 20.7 | 0.0 | 11,227.7 | 11,248.4
23 | LAN Party | 13.6 | 22.0 | 2.5 | 38.2
24 | Crossed Wires | 5.0 | 41.3 | 14.3 | 60.7
25 | Code Chronicle | 24.9 | 0.0 | 0.0 | 24.9

A weird thing I found when benchmarking is that for day 6 part two, ReleaseSafe actually ran faster than ReleaseFast (13,189.0 µs vs 24,370.7 µs). Their outputs are the same, but for some reason, ReleaseSafe is faster even with the safety checks still intact.

The Zig compiler is still very much a moving target, so I don’t want to dig too deep into this, as I’m guessing this might be a bug in the compiler. This weird behaviour might just disappear after a few compiler version updates.

Reflections

Looking back, I’m really glad I decided to do Advent of Code and followed through to the end. I learned a lot of things. Some are useful in my professional work, some are more like random bits of trivia. Going with Zig was a good choice too. The language is small, simple, and gets out of your way. I learned more about algorithms and concepts than the language itself.

Besides what I’ve already mentioned earlier, here are some examples of the things I learned:

Some of my self-imposed constraints and rules ended up being helpful. I can still (mostly) understand the code I wrote a few months ago. Putting all of the code in a single file made it easier to read since I don’t have to context switch to other files all the time.

However, some of them did backfire a bit, e.g. the two constraints that limit how I can optimise my code. Another one is the “hardcoding allowed” rule. I used a lot of magic numbers, which helped improve performance, but I didn’t document them, so after a while I no longer remembered how I got them. I’ve since gone back and added explanations in my write-ups, but next time I’ll remember to at least leave comments.

One constraint I’ll probably remove next time is the no concurrency rule. It’s the biggest contributor to the total runtime of my solutions. I don’t do a lot of concurrent programming, even though my main language at work is Go, so next time it might be a good idea to use Advent of Code to level up my concurrency skills.

I also spent way more time on these puzzles than I originally expected. I optimised and rewrote my code multiple times. I also rewrote my write-ups a few times to make them easier to read. This is by far my longest side project yet. It’s a lot of fun, but it also takes a lot of time and effort. I almost gave up on the write-ups (and this blog post) because I don’t want to explain my awful day 15 and day 16 code. I ended up taking a break for a few months before finishing it, which is why this post is published in August lol.

Just for fun, here’s a photo of some of my notebook sketches that helped me visualise my solutions. See if you can guess which days these are from:

Photos of my notebook sketches

What’s Next?

So… would I do it again? Probably, though I’m not making any promises. If I do join this year, I’ll probably stick with Zig. I’d had my eye on Zig since the start of 2024, so Advent of Code was the perfect excuse to learn it. This year, no language in particular has caught my eye, so I’ll just keep using Zig, especially since I now have a proper setup ready.

If you haven’t tried Advent of Code, I highly recommend checking it out this year. It’s a great excuse to learn a new language, improve your problem-solving skills, or just learn something new. If you’re eager, you can also do the previous years’ puzzles as they’re still available.

One of the best aspects of Advent of Code is the community. The Advent of Code subreddit is a great place for discussion. You can ask questions and also see other people’s solutions. Some people also post really cool visualisations like this one. They also have memes!


  1. I failed my first attempt horribly with Clojure during Advent of Code 2023. Once I reached the later half of the event, I just couldn’t solve the problems with a purely functional style. I could’ve pushed through using imperative code, but I stubbornly chose not to and gave up… ↩︎

  2. The original constraint was that each solution must run in under one second. As it turned out, the code was faster than I expected, so I increased the difficulty. ↩︎

  3. TigerBeetle’s code quality and engineering principles are just wonderful. ↩︎

  4. You can implement this function without any allocation by mutating the string in place or by iterating over it twice, which is probably faster than my current implementation. I kept it as-is as a reminder of what comptime can do. ↩︎

  5. As a bonus, I was curious as to what this looks like compiled, so I listed all the functions in this binary in GDB and found:

    72: static bool day04.Day04(140).matches__anon_19741;
    72: static bool day04.Day04(140).matches__anon_19750;

    It does generate separate functions! ↩︎

  6. Well, not always. The number of SIMD instructions depends on the machine’s native SIMD size. If the length of the vector exceeds it, Zig will compile it into multiple SIMD instructions. ↩︎

  7. Here’s a nice post on optimising day 9’s solution with Rust. It’s a good read if you’re into performance engineering or Rust techniques. ↩︎

  8. One thing about packed structs is that their layout is dependent on the system endianness. Most modern systems are little-endian, so the memory layout I showed is actually reversed. Thankfully, Zig has some useful functions to convert between endianness like std.mem.nativeToBig, which makes working with packed structs easier. ↩︎

  9. Technically, you can store two-digit base-26 numbers in a u10, as there are only 26² = 676 possible values. But most systems pad values to byte size, so a u10 would still be stored as a u16, which is why I just went straight for it. ↩︎


The tools of an Agentic Engineer

A lot of great things have their origins in the 1970s: Hip Hop was redefining music and street culture, Bruce Lee was taking martial arts to the next level, and the initial development of something called editor macros (also known as Emacs) was underway. I was born in that decade, but that's pure coincidence.

My primary development tool for the past couple of years has been that editor from the seventies. It is my editor of choice for Python, JavaScript, TypeScript, and Lisp dialects such as Clojure and Elisp. And today, as an agentic engineer, it has turned out to be a great choice for this kind of software development too. With the rise of various CLI, TUI, and desktop tools for AI development, it would be reasonable to think that this ancient code editor would become obsolete - right?

Not if you know the innovative Emacs community. It is driven by passion, support from the community itself, and Open Source. These ingredients are usually more resilient and reliable in the long term than VC-driven startup culture. Emacs is part of the greater Lisp community, where a lot of innovation takes place. The Clojure community is cutting edge in many aspects of software development, including AI.

More Agents

One thing I have noticed lately is that the more I get into Agentic Engineering, the more I use Emacs. As the focus has shifted from typing code to instructing and reviewing, I have found uses for Emacs powers I haven't really needed until now, such as Magit (for git), and I'm learning more about the powerful Org Mode. I didn't care much about Markdown before, but now it is an important part of the development itself, so I configured my Emacs to have a nice-looking, simple, and readable Markdown experience.

"More Agentic Engineering, More Emacs"

With Emacs, I use a great AI tool called Eca, and with it I am not locked into any specific vendor for agentic development. Vendor lock-in is something I really want to avoid. The combination of Eca and the power tools mentioned above makes for a very nice Agentic Engineering toolset. Eca is actively developed, has a lot of useful features, and offers a very nice developer experience. It supports standards like AGENTS.md, commands, skills, hooks, and sub-agents, and it uses a client-server setup in the same way as the Language Server Protocol. It is Open Source and not only for Emacs: have a look at the website for support for your favorite editor or IDE. By the way, Eca is developed in a Lisp (Clojure).

I have shared my Eca setup on GitHub, and I have also made some contributions to the Eca plugins repository.

Human Driven Development

With this setup, human review can happen in real time rather than waiting until the end, when the amount of code is too often overwhelming. The human developer (that's me) can act quickly upon noticing that things are taking a different route than expected, similar to the stop-the-line principle from the Toyota Way. This is a lean way to reach the end goal quickly: deploying code that is good enough for production and adds value.

I have found that many Agile practices, combined with developer-friendly tools, fit well with the ideas of Agentic Engineering, even though I've seen worrying signs of a return of the Waterfall movement.

To summarize: the result of my new Agentic Engineering development style is that I haven't put my IDE to the side. It's at the very center of the agentic workflow.



Top Photo by me, taken at Åreskutan, Jämtland, Sweden.


Senior Product Engineer (Frontend/ClojureScript) at Pitch Software GmbH



The Role

We're looking for a senior engineer with deep ClojureScript expertise to work directly with our CTO and leadership team on high-impact technical initiatives.

This role spans cross-team work that pushes the boundaries of what's possible: accelerating product innovation through AI-assisted development, shaping our product's future through rapid experimentation, and shipping delightful, performant software at scale.

What You’ll Do

  • Drive hands-on work on high-priority initiatives across the product
  • Partner with leadership to design and implement technically complex projects
  • Review and refine significant changes with an eye toward clarity, performance, and long-term maintainability
  • Evolve our shared systems, tooling, and frontend architecture in ClojureScript
  • Help maintain consistency in our engineering patterns, abstractions, and product quality
  • Collaborate closely with design and product to ensure technical decisions enhance the user experience

Requirements

  • Strong production experience building systems in ClojureScript
  • Deep understanding of how AI agents can be integrated into the development lifecycle — from requirements and code generation to testing, debugging, and deployment — while maintaining appropriate human oversight
  • An appreciation for simplicity in complex systems
  • Experience with frontend architecture at scale
  • Clear, thoughtful communication style
  • Strong engineering craft: readable code, thoughtful abstractions, pragmatic trade-offs
  • Comfort working collaboratively across teams

Nice to Haves

  • Experience working on design-driven products
  • Familiarity with functional programming beyond ClojureScript
  • Experience contributing to shared frontend libraries or design systems
  • Experience working in distributed teams

Permalink

How I Cut My AI Coding Agent's Token Usage by 120x With a Code Knowledge Graph

AI coding agents are powerful — but they're also blind. Every time Claude Code, Codex, or Gemini CLI needs to understand your codebase, they explore it file by file. Grep here, read there, grep again. For a simple question like "what calls ProcessOrder?", an agent might burn through 45,000 tokens just opening files and scanning for matches.

I built codebase-memory-mcp to fix this. It parses your codebase into a persistent knowledge graph — functions, classes, call chains, imports, HTTP routes — and exposes it through 14 MCP tools. The same question now costs ~200 tokens and answers in under 1ms.

The Problem: File-by-File Exploration Doesn't Scale

Here's what actually happens when you ask an AI agent "trace the callers of ProcessOrder":

  1. Agent greps for ProcessOrder across all files (~15,000 tokens)
  2. Reads each matching file to understand context (~25,000 tokens)
  3. Follows imports to find indirect callers (~20,000 tokens)
  4. Gives up after hitting context limits, missing half the call chain

Multiply this by every question in a coding session and you're burning hundreds of thousands of tokens per hour — most of it reading files that aren't relevant.

The Fix: Parse Once, Query Forever

codebase-memory-mcp runs a one-time indexing pass using tree-sitter AST parsing. It extracts every function, class, method, import, call relationship, and HTTP route into a SQLite-backed graph. After that, the graph stays fresh automatically via a background watcher that detects file changes.

You: "what calls ProcessOrder?"

Agent calls: trace_call_path(function_name="ProcessOrder", direction="inbound")

→ Returns structured call chain in ~200 tokens, <1ms

No LLM is embedded in the server. Your agent is the intelligence layer — it just gets precise structural answers instead of raw file contents.
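
For the curious, a call like the one above travels over MCP as a JSON-RPC 2.0 tools/call request. A rough sketch of its shape, written here as Clojure data (the id is illustrative):

;; Approximate wire shape of the MCP request behind the example above.
;; MCP is JSON-RPC 2.0; the agent fills in :name and :arguments.
{:jsonrpc "2.0"
 :id      7
 :method  "tools/call"
 :params  {:name      "trace_call_path"
           :arguments {:function_name "ProcessOrder"
                       :direction     "inbound"}}}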

Benchmarks: 120x Token Reduction

I ran agent-vs-agent testing across 31 languages (372 questions). Five representative structural queries on a real multi-service project:

Query Type                   Knowledge Graph   File-by-File Search   Savings
Find function by pattern     ~200 tokens       ~45,000 tokens        225x
Trace call chain (depth 3)   ~800 tokens       ~120,000 tokens       150x
Dead code detection          ~500 tokens       ~85,000 tokens        170x
List all HTTP routes         ~400 tokens       ~62,000 tokens        155x
Architecture overview        ~1,500 tokens     ~100,000 tokens       67x
Total                        ~3,400            ~412,000              121x

That's a 99.2% reduction. The cost difference between graph queries and file exploration adds up fast over a full development session.

It Handles the Linux Kernel

The stress test I'm most proud of: indexing the entire Linux kernel.

  • 28 million lines of code, 75,000 files
  • 2.1 million nodes, 4.9 million edges
  • Indexed in 1 minute on an M3 Pro in fast mode, and in 5 minutes with advanced indexing, which also covers large files and digs a bit deeper. An average repo indexes in under a second, depending on your hardware (the more CPU cores, the better).

The pipeline is RAM-first: LZ4-compressed bulk read, in-memory SQLite, fused Aho-Corasick pattern matching, single dump at the end. Memory is released back to the OS after indexing completes. Average-sized repos index in milliseconds.

64 Languages, Zero Dependencies

All 64 language grammars are vendored as C source and compiled into a single static binary. Nothing to install, nothing that breaks when tree-sitter updates a grammar upstream.

Programming languages (39): Python, Go, JavaScript, TypeScript, TSX, Rust, Java, C++, C#, C, PHP, Ruby, Kotlin, Scala, Swift, Dart, Zig, Elixir, Haskell, OCaml, Objective-C, Lua, Bash, Perl, Groovy, Erlang, R, Clojure, F#, Julia, Vim Script, Nix, Common Lisp, Elm, Fortran, CUDA, COBOL, Verilog, Emacs Lisp

Scientific (5): MATLAB, Lean 4, FORM, Magma, Wolfram

Config/markup (20): HTML, CSS, SCSS, YAML, TOML, HCL, SQL, Dockerfile, JSON, XML, Markdown, Makefile, CMake, Protobuf, GraphQL, Vue, Svelte, Meson, GLSL, INI

This matters because real-world codebases aren't monolingual. A typical project has Go backends, TypeScript frontends, SQL migrations, Dockerfiles, YAML configs, and shell scripts. One indexing pass captures all of it. We have also introduced more advanced indexing using LSP-like techniques, effectively creating an "LSP + tree-sitter" hybrid approach. It currently supports Go, C, and C++, with more languages coming soon.

14 MCP Tools

The full tool surface:

Tool              What it does
search_graph      Find functions/classes by name pattern, label, degree
trace_call_path   Follow callers/callees at configurable depth
get_architecture  Languages, packages, entry points, routes, hotspots, clusters
detect_changes    Map git diff to affected symbols with risk classification
query_graph       Raw Cypher queries (MATCH (f:Function)-[:CALLS]->(g)...)
search_code       Full-text search across indexed source
get_code_snippet  Read a specific function/class by qualified name
get_graph_schema  Inspect available node/edge types
manage_adr        Architecture Decision Records that persist across sessions
index_repository  Trigger initial index (auto-sync handles the rest)
list_projects     Show all indexed repos with stats
delete_project    Clean up a project's graph data
index_status      Check indexing progress
ingest_traces     Import OpenTelemetry traces into the graph

Works With 8 AI Agents

One install command auto-detects and configures all of these:

  • Claude Code — full integration with skills + PreToolUse hooks
  • Codex CLI — MCP config + AGENTS.md instructions
  • Gemini CLI — MCP config + BeforeTool hooks
  • Zed — JSONC settings integration
  • OpenCode — MCP config + AGENTS.md
  • Antigravity — MCP config + AGENTS.md
  • Aider — CONVENTIONS.md instructions
  • KiloCode — MCP settings + rules

The hooks are advisory — they remind agents to check the graph before reaching for grep/glob/read, without blocking anything.

Setup: 3 Commands

# Download (or use the one-liner: curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup.sh | bash)
tar xzf codebase-memory-mcp-*.tar.gz
mv codebase-memory-mcp ~/.local/bin/

# Auto-configure all detected agents
codebase-memory-mcp install

# Restart your agent, then:
"Index this project"

That's it. No Docker, no API keys, no npm install, no runtime dependencies. A ~15MB static binary for macOS (arm64/amd64), Linux (arm64/amd64), or Windows (amd64).

Built-In Graph Visualization

If you download the UI variant, you get a 3D interactive graph explorer at localhost:9749:

Graph visualization showing codebase knowledge graph with nodes and edges

It runs as a background thread alongside the MCP server — available whenever your agent is connected.

The Design Philosophy

A lot of code intelligence tools embed an LLM for natural language → graph query translation. This means extra API keys, extra cost, and another model to configure and keep updated.

With MCP, the agent you're already talking to is the query translator. It reads tool descriptions, understands your question, and makes the right tool call. No intermediate LLM needed.

Similarly, the tool focuses on structural precision over semantic fuzziness. When an agent asks "what calls X?", it needs an exact answer — not a ranked list of "probably related" functions. The graph gives exact call chains with import-aware, type-inferred resolution.

What's Next

  • LSP-style hybrid type resolution — already live for Go, C, and C++ (more languages coming)
  • Cross-service HTTP linking — discovers REST routes and matches them to HTTP call sites with confidence scoring
  • Louvain community detection — automatically discovers functional modules by clustering call edges

Try It

If you're burning tokens on file-by-file exploration, give it a shot. Index your project and ask your agent a structural question — the difference is immediate.

Built with pure C, tree-sitter, and SQLite. No runtime dependencies. 780+ stars and growing. We built it for developers using coding agents, and we want it to be the most performant solution in this space - we believe more efficient coding agents will, in turn, mean more good software shipped faster and cheaper in token burn.

Permalink

Side-stepping the Secretary Problem, unwittingly.

Having played both parts in the kabuki play that is employee-employer matchmaking, I feel the way we play it is a zero-sum game. I wish it were not so. When this post started life in 2024, as a wall-of-text chat message, it was brutal out there on both sides of the software industry interview table. The ZIRP had ended. As of 2026, post-ZIRP reality has properly set in and remains bad ("AI" is a Fig Leaf (Enterprise Edition) for structural damage they self-inflicted, and if you look at Hyperscaler GPU depreciation schedules, they are making it an order of magnitude worse). Set against that backdrop, here is a hopefully hopeful hiring anecdote in which I think we avoided the so-called "Secretary Problem", framed within Optimal Stopping Theory. It can be done. Non-zero-sum hiring ought to be the default mode for any industry, AI or no AI.

Permalink

Clojure Deref (Mar 18, 2026)

Welcome to the Clojure Deref! This is a weekly link/news roundup for the Clojure ecosystem (feed: RSS).

Clojure Data Science Survey

Do you use Clojure for Data Science? Please take the survey. Your responses will help shape the future of the Noj toolkit and the Data Science ecosystem in Clojure.

Clojurists Together: Call for Proposals

Clojurists Together has opened the Q2 2026 funding round for open-source Clojure projects. Applications will be accepted through March 19th.

Read the announcement for more details.

Upcoming Events

Libraries and Tools

Debut release

  • spel - Idiomatic Clojure wrapper for Playwright. Browser automation, API testing, Allure reporting, and native CLI - for Chromium, Firefox, and WebKit

  • pants_backend_clojure - Pants build tool backend for Clojure

  • edgarjure - Clojure library for accessing SEC EDGAR filings — company lookup, filing content, XBRL financials, and NLP item extraction via SEC’s public APIs

  • aimee - Aimee is a Clojure library for streaming and non-streaming OpenAI compatible Chat Completions over core.async channels.

  • text-diff - Line-level text diffing for Clojure, ClojureScript and babashka

  • livewire - Embedded nREPL wire into a running Spring Boot app — giving AI agents and humans a live probe into the JVM. Inspect beans, trace SQL, detect N+1s, and hot-swap @Query annotations. Zero restarts.

  • rewrite-json - A Clojure library for format-preserving JSON and JSONC editing

  • clj-figlet - A native Clojure re-implementation of FIGlet — the classic ASCII art text renderer.

Updates

Permalink

How to Get Started with Machine Learning (2026 Implementation Guide)

Moving from data collection to actual AI software development and machine learning implementation is no longer just a nice-to-have; it is how companies stay in business.

In 2026, businesses that invest in AI development services or partner with a reputable machine learning development company can finally turn all that raw data into AI-powered business intelligence (BI) that actually works.

If businesses wait, they are already behind. Competitors are busy automating workflows, personalizing customer interactions, and growing faster thanks to machine learning. This guide walks through how to get started with ML in 2026 and highlights the common mistakes that trip up most newcomers.

So why step in now? Three things make 2026 the year everything shifts for AI:

  • Mature ecosystems: Tools are ready to use. Platforms like AWS SageMaker, Google Vertex AI, and new options for private deployments make machine learning more accessible than ever.
  • Regulatory clarity: The rules are now clear. GDPR, CCPA, and the new AI Act lay out exactly how to use AI responsibly.
  • Competitive necessity: The pressure is on. Whether it is predicting customer churn or automating paperwork, machine learning has moved beyond trials. It is just the way business operates now.

The 5-Step Roadmap to AI Integration

Learning to use AI effectively is best approached as a journey rather than a single step. Organizations can follow five clear stages that build on one another: defining the problem, preparing the data, selecting the appropriate model, testing through a pilot, and finally scaling the solution thoughtfully and responsibly. Each stage plays a critical role. By progressing methodically, teams can avoid costly mistakes while giving their AI initiatives a strong foundation for long-term success.

Step ❶: Problem Definition

Start by figuring out where AI can actually help. The best projects begin with a real business pain point and a measurable goal. Do not waste time on unclear ideas- focus on a specific goal.

Some typical examples? 

  • Churn prediction for subscription businesses. 
  • Automating legal or finance documents. 
  • Fraud detection in banking. 

These are practical cases with a clear impact. They don’t need huge datasets, complex systems, or long setup times. They work, and they show the company that AI is real.

👉 McKinsey & Company estimates that AI and analytics could add $3.5 to $5.8 trillion in value each year across industries, showing the strong ROI of well‑planned machine learning.

A good machine learning development company can identify the right starting point, help businesses secure that early win, and lay a foundation for greater achievements in AI software development.

Step ❷: Data Audit & Preparation

Strong models depend on strong data. Before businesses build anything, take a good look at what they have.

Key questions to consider include:

  • Is the data fragmented across multiple systems? If so, efforts should be made to break down data silos and establish unified access.
  • Is the data consistent over time, or have calculation methods changed along the way?
  • Can teams access the data while remaining fully compliant with security, privacy, and regulatory requirements?
  • Is the data clean, structured, and consistent? This may require removing duplicates, standardizing formats, and addressing missing or incomplete values.

Structured data, such as CRM records, is typically easier to manage and analyze. However, organizations should not overlook unstructured data, including emails, PDFs, and images, which often contain valuable insights. This is where AI development services add significant value by organizing and transforming unstructured information into formats that can be effectively analyzed. Even the most advanced models cannot compensate for poor-quality data. Simply put, reliable and well-prepared data is essential for achieving meaningful AI outcomes.
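
A minimal sketch of that preparation step in Clojure - hypothetical CRM records, with formats standardized, incomplete rows dropped, and duplicates removed (parse-double requires Clojure 1.11+):

(require '[clojure.string :as str])

;; Hypothetical CRM records: mixed-case emails, string amounts, a missing value.
(def raw-records
  [{:email "A@X.COM" :amount "100"}
   {:email "a@x.com" :amount "100"}
   {:email "b@y.com" :amount nil}])

(defn clean [records]
  (->> records
       (map #(update % :email str/lower-case))   ;; standardize formats
       (remove #(nil? (:amount %)))              ;; drop incomplete rows
       (map #(update % :amount parse-double))    ;; enforce consistent types
       distinct))                                ;; remove duplicates

(clean raw-records)
;; => ({:email "a@x.com", :amount 100.0})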

Structured vs. Unstructured Data Readiness

Category      Structured Data                     Unstructured Data
Format        Organized, labeled                  Raw, messy, no fixed format
Examples      CRM records, transactions           Emails, PDFs, images, videos
Ease of Use   Ready for ML models                 Needs cleaning and processing
Preparation   Minimal work                        Heavy preprocessing required
Use Cases     Churn prediction, fraud detection   Sentiment analysis, document automation

Step ❸: Choosing the Right Model

Not every problem requires the same AI approach. The key is selecting the method that best fits the use case.

When labeled data is available and the goal is to predict a specific outcome, such as identifying customers likely to churn, supervised learning is often the most effective choice. If labeled data is not available, unsupervised learning can help uncover hidden patterns, such as grouping customers with similar behaviors.

For tasks involving large volumes of text, such as extracting key insights or summarizing contracts, large language models (LLMs) are particularly well-suited. Choosing the right approach keeps AI solutions practical, efficient, and aligned with business objectives.
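
As a toy illustration of the supervised case - a one-nearest-neighbor churn classifier in plain Clojure (the features, labels, and training data are invented for the example):

;; Each training example: [[logins-per-week support-tickets] label]
(def training
  [[[9 0] :stays]
   [[8 1] :stays]
   [[1 5] :churns]
   [[0 4] :churns]])

(defn distance [a b]
  (Math/sqrt (reduce + (map #(let [d (- %1 %2)] (* d d)) a b))))

;; Predict by copying the label of the closest labeled example.
(defn predict [features]
  (second (apply min-key (fn [[fs _]] (distance features fs)) training)))

(predict [2 4])
;; => :churns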

Step ❹: Pilot & MVP

Avoid deploying AI across the entire organization at once, as this can introduce unnecessary risk and complexity. Instead, begin with a minimum viable product (MVP) or a focused pilot to validate the approach and gather insights before scaling.

Start small. Test with real data. See how it performs, and gather feedback from the people who use it. That builds trust and helps convince skeptics. Privacy‑first AI software development should test in secure environments and safeguard sensitive data.

Step ❺: Scaling & Optimization

If the pilot proves successful, the next step is to scale it thoughtfully. However, AI systems cannot simply be deployed and left unattended—they require ongoing monitoring and maintenance. As business conditions and data evolve, models can drift and lose accuracy. Organizations should continuously evaluate model performance, retrain with updated data, and monitor for bias, security, or compliance concerns. When managed effectively, AI-powered business intelligence (BI) can significantly transform how organizations analyze data and make decisions.

Reports run on their own, dashboards update in real time, and decisions happen faster. AI is now at the center of operations.
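
As a hedged illustration of that monitoring loop, here is a toy drift check in Clojure that flags a model when its accuracy on recent labeled traffic falls below a floor (the names and the 0.8 threshold are made up for the example):

;; Toy drift check: compare recent accuracy against a minimum threshold.
(defn accuracy [predictions labels]
  (/ (count (filter true? (map = predictions labels)))
     (double (count labels))))

(defn drift-alert? [predictions labels threshold]
  (< (accuracy predictions labels) threshold))

;; Recent window: the model got 2 of 4 right; with a 0.8 floor, retraining is flagged.
(drift-alert? [:churn :stay :churn :churn]
              [:churn :churn :stay :churn]
              0.8)
;; => true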

Key Takeaway

Bringing AI into the business is not a one-shot deal. Start small, prove it works, and then scale up carefully. Every step a business takes cuts down risk and builds momentum. With the right AI software development and machine learning development company, AI goes from experiment to essential- and helps businesses grow in a way that is smart, safe, and aligned with the goals.

Build vs. Buy in Machine Learning

Category          Off-the-Shelf AI APIs                                    Custom AI Software Development
Data Privacy      High risk of leakage, limited control over shared data   Privacy-first, full control
Accuracy          Generic results                                          Tuned to your data
Cost              High per request, low upfront costs                      Lower long-term
Flexibility       Limited options                                          Full roadmap control
Integration       Quick plug-and-play                                      Tailored to existing systems
Scalability       May hit usage limits                                     Scales with your infrastructure
Support           Vendor-dependent                                         In-house expertise
Speed to Launch   Fast start                                               Longer build time
Ownership         No IP ownership                                          Full IP ownership
Customization     One-size-fits-all                                        Designed for your needs

Essential Tools for AI Tech Stack in 2026

Core Languages

  • Python → Preferred for research and quick experiments, thanks to its vast libraries and vibrant community.
  • Clojure → It is gaining popularity in production thanks to its functional design, which boosts scalability and reliability.
  • Rust → It comes in when companies need raw speed, keeping large-scale AI systems both fast and memory-safe.
  • Julia → A strong fit when companies are deep into math or scientific computing.

📌 Note: Clojure is increasingly regarded as a solid choice for production machine learning. 

Machine learning development company Flexiana uses Clojure for a reason- it helps them create systems that actually last.

  • Functional style → Since Clojure works with immutable data, businesses get fewer unexpected side effects, leading to fewer bugs creeping in. That is a big deal when businesses are running massive operations and need to trust their systems.
  • Concurrency → When it comes to handling lots of tasks at once, Clojure does the job well. It runs on the JVM, so it handles the heavy, parallel workloads businesses see in large machine learning pipelines.
  • Python interop → Flexiana runs production systems in Clojure but still trains models in Python. With libpython-clj, Python models can run directly inside Clojure. This way, teams get Python’s rich ML ecosystem plus Clojure’s stability- the best of both worlds.
  • Maintainability → Long‑term upkeep is easier with Clojure’s clean, composable design, especially when businesses are not just experimenting but actually running ML in production.
  • Ecosystem fit → Flexiana already has experience with Clojure. Keeping everything in the same language just makes their whole stack neater and more consistent.

We pick Clojure because it gives us control and reliability for real-world machine learning - without giving up the flexibility of Python when we need it. It is a solid balance between trying new things and keeping everything running smoothly.
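
A minimal sketch of that interop, assuming libpython-clj2 is on the classpath and scikit-learn is installed in the linked Python environment:

(require '[libpython-clj2.python :as py])
(py/initialize!)

;; Train a scikit-learn model from Clojure; Clojure data marshals automatically.
(def linear-model (py/import-module "sklearn.linear_model"))
(def model (py/py. linear-model LinearRegression))

(py/py. model fit [[1] [2] [3]] [2 4 6])
(py/py. model predict [[4]])
;; => roughly [8.0]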

Infrastructure

  • AWS SageMaker → It covers everything- training, deploying, monitoring- all in one spot.
  • Google Vertex AI → It organizes business datasets, pipelines, and deployed models.
  • Azure ML → It is a go-to if the team is already using Microsoft tools.
  • Privacy‑first local setups → On-prem or edge- help keep sensitive data protected.
  • Hybrid models → They give businesses cloud power while letting them keep control where they need it.
  • Containers & orchestration → Tools like Docker, Kubernetes, and serverless endpoints on AWS and Google Cloud keep business models portable and simple to run.

Libraries

  • Clojure → Users lean on scicloj.ml for building functional ML pipelines and can reach the major Python libraries through libpython-clj.
  • Python → PyTorch and TensorFlow are still the kings of deep learning.
  • Specialized → Hugging Face leads in NLP, RAPIDS focuses on GPU data science, and LangChain handles LLM workflows.
  • Visualization → When businesses need to see their data, Plotly and Vega stand out, and AI-powered dashboards are now appearing too.

MLOps & Tooling

  • Experiment tracking → To track experiments, MLflow and Weights & Biases get the job done.
  • Monitoring → Evidently AI and Arize help businesses keep an eye on their models.
  • Version control → DVC and Git workflows manage both data and models.
  • Pipeline automation → Automate business pipelines with Kubeflow and Airflow.
  • CI/CD for AI → If businesses want to implement CI/CD, GitHub Actions and Jenkins (with ML plugins) maintain progress.

Security & Privacy

AI has become part of everyday business operations, making data protection more important than ever. Organizations cannot afford mistakes when it comes to sensitive information or regulatory compliance. Because of this, companies are placing much greater emphasis on security and privacy when developing AI systems. Several approaches are commonly used to achieve this.

  • Federated Learning:
    Instead of sending all raw data to a central server, federated learning allows AI models to learn directly from data where it already exists. Only model updates are shared, not the actual data. This approach helps keep sensitive information private while still improving the model. It also supports compliance with privacy regulations such as GDPR and HIPAA.
  • Differential Privacy:
    Differential privacy protects individuals by introducing small amounts of random noise into datasets during analysis. This allows teams to detect useful patterns and insights without exposing personal or identifiable information (a small sketch follows this list).
  • Zero-Trust Architecture:
    Zero-trust security operates on the principle that no user or system is automatically trusted. Every request must be verified for identity and permission before access is granted. While strict, this model significantly reduces the risk of unauthorized access from both external threats and internal misuse.
  • Synthetic Data:
    In many situations, real data cannot be shared due to privacy restrictions. Synthetic data provides a useful alternative. It is artificially generated but designed to mimic the patterns of real datasets. This allows teams to train AI models effectively without compromising anyone’s privacy.
  • Data Consistency and Calculation Drift:
    AI systems can fail if the underlying data or calculations change unexpectedly. For example, modifying how metrics are measured or adjusting formulas can disrupt model predictions. Regular data audits help teams detect these issues early, ensuring that AI systems continue to perform reliably.
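
To make the differential-privacy idea concrete, here is a toy Clojure sketch of the classic Laplace mechanism applied to a counting query (the epsilon and data are illustrative, not a production implementation):

;; Sample Laplace noise via inverse CDF: x = -b * sgn(u) * ln(1 - 2|u|), u in (-0.5, 0.5).
(defn laplace-noise [scale]
  (let [u (- (rand) 0.5)]
    (* (- scale) (Math/signum u) (Math/log (- 1.0 (* 2.0 (Math/abs u)))))))

;; A counting query has sensitivity 1, so the noise scale is 1/epsilon.
(defn private-count [records epsilon]
  (+ (count records) (laplace-noise (/ 1.0 epsilon))))

(private-count (range 1000) 0.5)
;; => a noisy value near 1000; smaller epsilon means more noise and more privacy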

Emerging Trends Shaping AI Strategy

AI is evolving rapidly, and new technologies continue to influence how organizations design and deploy intelligent systems. Several important trends are currently shaping AI strategies.

  • Edge AI:
    Instead of relying entirely on cloud infrastructure, some AI models now run directly on devices such as smartphones, smartwatches, or other edge devices. Processing data locally reduces latency, improves response time, and enhances privacy since data does not always need to leave the device.
  • Green AI:
    Training large AI models can consume significant amounts of energy. Green AI focuses on improving efficiency by using smaller models, optimized computing techniques, and cleaner energy sources. The goal is to reduce environmental impact while also lowering operational costs.
  • AutoML (Automated Machine Learning):
    AutoML tools automate many complex machine learning tasks, such as model selection and hyperparameter tuning. This allows organizations with limited AI expertise to build effective models quickly, making AI development more accessible.
  • AI Governance:
    As AI systems become more widely used, proper oversight becomes essential. Organizations must be able to explain how their models make decisions and demonstrate that their systems operate fairly and responsibly. This involves maintaining audit trails, monitoring for bias, and clearly documenting models. Transparency is not only important for regulators but also for building trust with users and customers.

Comprehensive machine learning and AI development stack

Addressing the Biggest Obstacle: Privacy & Compliance

Key Regulations to Know

  • GDPR (EU): Strong data laws and severe fines for errors.
  • CCPA (California): Demands clear privacy rights and transparency for consumers.
  • Right to be Forgotten: People can ask to have their data erased, no questions asked.
  • EU AI Act (2026): New rules will categorize AI systems by risk.

Privacy-First AI Software Development 

  • Start with privacy. Make compliance part of business AI from day one, not just an add-on later.
  • Collect less data. Only grab what the business really needs- avoid stockpiling.
  • Use privacy tools. Consider anonymization, encryption, or even synthetic data to protect people’s information (a small sketch follows this list).
  • Keep track of everything. Know exactly where business data comes from and how you are using it.
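
As one small example of such a privacy tool - a toy Clojure sketch that pseudonymizes direct identifiers with a salted hash before data is shared (the field names and salt are invented):

(import 'java.security.MessageDigest)

;; Replace a direct identifier with a salted SHA-256 hex digest.
(defn pseudonymize [value salt]
  (let [md (MessageDigest/getInstance "SHA-256")
        digest-bytes (.digest md (.getBytes (str salt value) "UTF-8"))]
    (apply str (map #(format "%02x" %) digest-bytes))))

(defn anonymize-record [record salt]
  (update record :email pseudonymize salt))

(anonymize-record {:email "a@x.com" :spend 120} "s3cret-salt")
;; => {:email "<64 hex chars>", :spend 120}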

Essential Operations

  • Monitor automatically: Set up automatic monitoring to spot privacy issues as they happen.
  • Keep detailed records: Have a clear audit trail for every AI decision.
  • Explain decisions: Explain the business’s AI decisions, both for the users and for regulators. No hidden components.
  • Enable user control: Give users the ability to edit or delete their data at any time.

Preparing for the Future

  • Risk classification: High-risk AI (such as in hiring, healthcare, and law enforcement) is subject to stricter rules.
  • Human oversight: Keep humans in the loop. Big decisions need a real person to review them.
  • Global standards: Plan for global rules. Every country’s got its own standards, so avoid being unprepared.
  • Continuous updates: Stay up to date with changing regulations.

Bottom line: Make privacy and compliance part of your AI plan from the very beginning. It is much easier to build now than to rush and fix later.

Why Machine Learning with Clojure is the Secret Weapon

Concurrency

Clojure lets teams run a bunch of tasks at once without worrying about them conflicting. That is significant when teams are handling real-time data. Consider business dashboards- they stay up-to-date, even as new numbers roll in. The retail team can watch sales, inventory, and customer trends update in real time and immediately modify their marketing offers.
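
A small sketch of that safety in Clojure - one hundred concurrent writers updating a shared dashboard atom with no locks and no lost updates (the dashboard shape is invented for the example):

;; Shared state for a hypothetical retail dashboard.
(def dashboard (atom {:sales 0 :inventory 1000}))

(defn record-sale! [n]
  ;; swap! applies the update atomically, retrying on conflict,
  ;; so concurrent writers never clobber each other.
  (swap! dashboard #(-> %
                        (update :sales + n)
                        (update :inventory - n))))

;; 100 sales recorded from 100 threads at once:
(run! deref (mapv (fn [_] (future (record-sale! 1))) (range 100)))

@dashboard
;; => {:sales 100, :inventory 900}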

Stability

With Clojure, the data does not change. Once teams set it, it gets locked in. When they run experiments or build models, the results stay the same. It makes bugs easier to find and builds trust in the data.

Code comparison:
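
A minimal sketch of the contrast, with the Python half shown as a comment:

;; Python (mutable):   xs = [1, 2, 3]; xs.append(4)  -> xs is now [1, 2, 3, 4]
;; Clojure (immutable): conj returns a new vector; the original is untouched.
(def xs [1 2 3])
(def ys (conj xs 4))

xs ;; => [1 2 3]   (unchanged)
ys ;; => [1 2 3 4] (a new copy, structurally shared with xs)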

Takeaway: Python → list changes. Clojure → vector stays, new copy made.

Interoperability

Clojure fits into existing business infrastructure since it runs on the JVM. Teams get all the benefits of functional programming, but they can still use Python’s machine learning libraries or Java’s tools whenever they want. That makes things smoother. For example, a financial services company can run dependable pipelines in Clojure and still plug in models from TensorFlow or PyTorch.

Concurrency + Stability + Interoperability → Clojure makes machine learning practical, reliable, and ready for real business.

Choosing a Machine Learning Development Company

Technical Depth vs. API Wrappers

Look, not all AI partners build things the same way. Some vendors only apply a thin API wrapper on an existing tool and consider it done. Sure, it is fast, but businesses won’t get anything unique or scalable out of it. The real value comes from teams that dig deeper- they design custom models, set up pipelines tailored to the enterprise, and actually integrate everything with the business. Quick fixes might get everything started, but they won’t last as business requirements grow. 

If your business wants something that scales with you, ignore the surface-level details and find an AI software development partner who understands how to build real AI systems from the start.

Ethical Standards

Accuracy is not the only thing that counts in AI. Responsibility matters just as much. The right company does not just build models- they make sure those models are fair, explainable, and transparent. Businesses should be able to trust their work, and so should the customers. Plus, with all the rules around AI these days, businesses need an AI development service that takes ethics seriously, not one that treats it like a checkbox at the end.

Who is Flexiana

Flexiana is a global machine learning development company with over 70 developers and expertise in more than 25 programming languages. We don’t do one-size-fits-all projects. Instead, we work on custom solutions designed for your business, not someone else’s. What really sets us apart?

  • We write clean, reproducible code, so businesses actually understand and trust the models.
  • We build for the long term, making sure the system can scale as you grow.
  • We bring a ton of experience, from AI and blockchain to complex enterprise systems.

Flexiana works with companies that want both technical smarts and strong ethical standards. We are not just delivering code- we are offering privacy‑first AI software development that lasts and protects your data at every step.

Build smarter with AI‑powered business intelligence (BI)– connect with our team.

FAQs on Getting Started with ML

Q1: How much data do I need for machine learning?  

Honestly, it depends on what you’re trying to build. Some models manage with just a few thousand records, while deep learning projects may require millions. But here’s the thing: clean, relevant data beats sheer scale almost every time. A good machine learning development company can help you figure out the right balance.

Q2: Is AI only for large enterprises?  

Not at all. Small and mid-sized businesses use AI regularly. With the right AI software development partner, even a small team can set up automation, create forecasting tools, or dig into customer insights. The scale might change, but the value’s there for everyone.

Q3: What’s the difference between AI and ML?  

AI (artificial intelligence) is the big idea- making machines act smart. ML (machine learning) is one way to do that. It learns patterns from data. So, AI is the goal, and ML is the method. Most AI development services use ML as their main engine.

Q4: What makes privacy important in AI?  

Privacy matters- a lot. When you take a privacy-first approach to AI software development, you handle data responsibly, keep models compliant, and build trust with users. Skipping this step just invites risk, no matter how good your models are.

Q5: Why choose machine learning with Clojure?  

Clojure really stands out for machine learning. Clojure’s immutable data makes experiments easy to repeat, and its concurrency lets you build real‑time pipelines that stay stable under heavy load. That’s why many teams choose Clojure for ML systems.

Q6: Can AI help with business intelligence?  

Definitely. AI‑powered Business Intelligence (BI) dashboards process live data and give decision‑makers instant insights. Companies don’t just spot trends or risks- they can act and respond before it’s too late.

In Summary 

Machine learning is not optional now- it is how companies survive. The ones jumping in early get to build systems that actually scale, stay on the right side of ethics, and turn AI into real results. Wait too long, and your business will be left scrambling while everyone else uses AI to move faster, save money, and identify opportunities your business will miss.

Here’s why it matters:

  • Scalability → ML grows with you. No more systems slowing you down.
  • Trust → Designing for privacy and keeping things transparent wins over customers and regulators.
  • Speed → AI-powered business intelligence gives leaders real-time insights so they can act fast, before risks blow up.

Curious about what’s happening at Flexiana? Subscribe to our newsletter- it lands every two months, and we promise no spam!

The post How to Get Started with Machine Learning (2026 Implementation Guide) appeared first on Flexiana.

Permalink

Copyright © 2009, Planet Clojure. No rights reserved.
Planet Clojure is maintained by Baishampayan Ghose.
Clojure and the Clojure logo are Copyright © 2008-2009, Rich Hickey.
Theme by Brajeshwar.