Check out Scout Monitoring’s ollama-rails GitHub repo for samples on how to use ollama-ai to communicate with Ollama.

Large Language Models (LLMs) have emerged as a game-changer, enabling machines to understand, generate, and process human language with unprecedented accuracy and fluency.

One such tool that has gained significant attention is Ollama, a cutting-edge platform that allows developers to run LLMs locally without relying on cloud services. Ollama provides a seamless and efficient way to harness the power of LLMs, making it an attractive choice for developers looking to integrate AI capabilities into their applications.

In this article, we will explore the integration of Ollama with a Ruby on Rails application. We’ll discuss the various problems that Ollama can solve and provide step-by-step guidance on incorporating Ollama into your Ruby on Rails project. Whether you are a seasoned developer or just starting with AI, this article will equip you with the knowledge and tools necessary to leverage the power of LLMs in your applications.

So, let’s discover how Ollama and Ruby on Rails can be combined to create intelligent and innovative solutions.

What is Ollama?

Ollama is a tool for running large language models locally without the need for a cloud service. Its usage is similar to Docker’s, but it’s specifically designed for LLMs. You can use it as an interactive shell, through its REST API, or from a Python library.
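For example, once installed, you can talk to a model from the interactive shell or reach the same model through the REST API. A quick sketch, assuming the default local port 11434 and a pulled llama2 model:

# Interactive shell
ollama run llama2

# REST API
curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "Hello llama2!", "stream": false}'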

What problems do Large Language Models solve?

Natural Language Processing (NLP): Large language models can be used for various NLP tasks such as language translation, sentiment analysis, and topic modeling.

Text Generation: Models like the one we are using here can generate text, including articles, stories, and even entire books. This has applications in content creation, writing assistance, and language learning.

Question Answering: Large language models can answer questions based on the knowledge they have been trained on. This has applications in customer service, tutoring, and virtual assistants.

Dialogue Systems:  Models like the one we are using here can engage in conversations with humans, providing assistance or entertainment. This has applications in chatbots, voice assistants, and other interactive systems.

Language Translation: Large language models can translate text from one language to another. This has applications in document translation, website localization, and cross-cultural communication.

Summarization:  Models like the one we are using here can summarize long pieces of text, such as articles or documents, into shorter, more digestible versions. This has applications in news aggregators, research assistance, and content recommendation systems.

Conversational AI:  Models like the one we are using here can be used to create conversational AI systems that can engage in natural-sounding conversations with humans. This has applications in customer service, virtual assistants, and other interactive systems.

Language Learning: Large language models can support learning new languages by generating text in the target language based on a given prompt or input. This has applications in language learning and language teaching.

Content Generation: The model we’re showing here can generate content, such as articles, blog posts, or social media updates, based on a given topic or prompt. This has applications in content creation, writing assistance, and marketing automation.

These are just a few examples of the many potential applications of large language models.

Why we want to integrate with Ruby on Rails

Ollama officially supports CLI and REST API usage, but there are also many community integrations, such as:

– Web & Desktop

– Terminal

– Database

– Package managers

– Libraries

– Mobile

– Extensions & plugins

The complete list of integrations can be found in the official Ollama GitHub repository. You can also find all the models supported by Ollama, with further details, at ollama.com/library.

We will use the ollama-ai gem for our integration. Since it is a plain Ruby gem, it can be used in any Ruby script, but the project containing all the code snippets for this article is a simple Ruby on Rails application.

ollama-ai integration

The ollama-ai gem provides the following methods.

Methods:

– client.generate

– client.chat

– client.embeddings

Model methods:

– client.create

– client.tags

– client.show

– client.copy

– client.delete

– client.pull

– client.push

We are going to create a simple example using the generate method. First, we create a new Ruby on Rails project:

rails new ollamaai

Then we need to add the ollama-ai gem to the project Gemfile:

gem 'ollama-ai', '~> 1.2.1'
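After adding the gem, install it as usual:

bundle install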

To keep the code organized, we can create a controller:

rails generate controller ollamaai

The next step is to create a method in the controller and a corresponding view file:

#ollamaai controller method
    def index
      client = Ollama.new(
        credentials: { address: 'http://localhost:11434' },
        options: { server_sent_events: true }
      )

      @result = client.generate(
        { model: 'llama2',
          prompt: 'Hello llama2!',
          stream: false }
      )
    end

We can move the client creation to a separate method and run it with a before_action callback. This way, the other examples we create will not need to build the client inside their own actions:

  before_action :create_client

    def create_client
      @client = Ollama.new(
        credentials: { address: 'http://localhost:11434' },
        options: { server_sent_events: true }
      )
    end

The view to display the result from our client.generate call could look like this:

#ollamaai/index.html.erb
<h1>Ollama-ai examples</h1>

<% if @result.present? %>
  <pre><%= JSON.pretty_generate(@result) %></pre>
<% end %>

We also need a route for the new controller method:

#routes.rb
Rails.application.routes.draw do
  root "ollamaai#index"
  #get '/', to: 'ollamaai#index'
end

Using root or get is equivalent in this case. You may choose your preferred option, but root is usually reserved for the home or landing page in a real-world or larger application.

At this point, we can try calling the index action by visiting / at localhost:3000.

However, it will fail unless Ollama is installed locally, which ollama-ai requires. Ollama can be downloaded and installed by following the instructions on Ollama’s official site.
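The examples also assume that the llama2 model used in the snippets has already been pulled locally; if not, you can fetch it from the terminal (the first download may take a while):

ollama pull llama2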

Now we can try again, and it should work. Make sure the Ollama service is running. The first request may take a while while the model loads, and the response time also depends on which method we call and its parameters.

The response should look like this:

Ollama-ai examples

[
  {
    "model": "llama2",
    "created_at": "2024-04-24T12:42:13.810473Z",
    "response": "\n*giggles* Hi there, cutie! *blinks* How are you today?  ",
    "done": true,
    "context": [
      518,
      25580,
      29962,
      3532,
      14816,
      29903,
      29958,
      5299,
      829,
      14816,
      29903,
      6778,
      13,
      13,
      10994,
      11148,
      3304,
      29906,
      29991,
      518,
      29914,
      25580,
      29962,
      13,
      13,
      29930,
      29887,
      22817,
      793,
      29930,
      6324,
      727,
      29892,
      5700,
      347,
      29991,
      334,
      2204,
      19363,
      29930,
      1128,
      526,
      366,
      9826,
      29973,
      29871,
      243,
      162,
      147,
      179,
      243,
      162,
      149,
      152
    ],
    "total_duration": 8460402702,
    "load_duration": 1904410860,
    "prompt_eval_count": 25,
    "prompt_eval_duration": 1698462000,
    "eval_count": 31,
    "eval_duration": 4856036000
  }
]

The context attribute is displayed across many lines because we used JSON.pretty_generate, which makes JSON output more readable. You can replace JSON.pretty_generate(@result) with @result, and the output will be plain/raw; you can then format it however you prefer.

Generate

In the previous example, we used the generate method, but we can also use it like this:

    def generate
      @result = @client.generate(
        { model: 'llama2',
          prompt: 'Hello from Ruby on Rails!' }
      )
      render template: "ollamaai/index"
    end

The difference in this method is that we removed stream: false, which means the response will be streamed and the result will be an array of events.
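As an aside, if you want the full generated text rather than the raw list of events, you can join the streamed chunks. A minimal sketch (each event carries its fragment of the answer in a 'response' key; @full_text is just an illustrative name):

      events = @client.generate(
        { model: 'llama2',
          prompt: 'Hello from Ruby on Rails!' }
      )
      # Concatenate the text fragments from every streamed event.
      @full_text = events.map { |event| event['response'].to_s }.join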

We also need to create the route:

#routes.rb
get '/generate', to: 'ollamaai#generate'

The code for the view can be the same as in ollamaai/index.html.erb.

Chat

The chat method generates the next message in a chat with a provided model. Let’s create an example that uses it. First, we create the route:

#routes.rb
get '/chat', to: 'ollamaai#chat'

Then, we add the chat method to the controller:

    def chat
      @result = @client.chat(
        { model: 'llama2',
          messages: [
            { role: 'user', content: 'Hi! My name is Ruby on Rails Developer' }
          ] }
      ) do |event, raw|
        # This outputs to stdout, but @result also gets the response events
        puts event
      end
      render template: "ollamaai/index"
    end

The code for the view can be the same as in ollamaai/index.html.erb.
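Since each request is stateless, keeping a conversation going means sending the previous messages back with every call. A minimal sketch (the chat_follow_up name and the hard-coded assistant reply are just for illustration):

    def chat_follow_up
      @result = @client.chat(
        { model: 'llama2',
          messages: [
            { role: 'user', content: 'Hi! My name is Ruby on Rails Developer' },
            { role: 'assistant', content: 'Nice to meet you! How can I help you today?' },
            { role: 'user', content: 'What is my name?' }
          ] }
      ) do |event, raw|
        puts event
      end
      render template: "ollamaai/index"
    end

As with the other examples, you would add a matching route for this action.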

Embeddings

The embeddings method generates embeddings from a model. To create an example for this method, we first add the route:

#routes.rb
get '/embeddings', to: 'ollamaai#embeddings'

Then, we add a new method to the controller:

    def embeddings
      @result = @client.embeddings(
        { model: 'llama2',
          prompt: 'Hello!' }
      )
      render template: "ollamaai/index"
    end

The code for the view can be the same as in ollamaai/index.html.erb.
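Embeddings are numeric vectors, so a common use is comparing how similar two pieces of text are. A minimal sketch (assuming, as in the Ollama API, that the response is an array whose first element holds the vector under an 'embedding' key; the similarity action is just for illustration):

    def similarity
      a = @client.embeddings({ model: 'llama2', prompt: 'I love Ruby on Rails' }).dig(0, 'embedding')
      b = @client.embeddings({ model: 'llama2', prompt: 'Rails is my favorite web framework' }).dig(0, 'embedding')

      # Cosine similarity: values closer to 1.0 mean more similar meaning.
      dot = a.zip(b).sum { |x, y| x * y }
      @result = { similarity: dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })) }
      render template: "ollamaai/index"
    end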

Model create

We can also create our own models. The create method builds a model from a Modelfile. It is recommended to set the modelfile parameter to the content of the Modelfile rather than just a path to the file itself.
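As a quick aside, if you keep the Modelfile on disk, you can read its content and pass that along rather than the path. A minimal sketch, assuming a Modelfile sitting at the project root:

      # Hypothetical example: load the Modelfile content from the project root.
      modelfile_content = File.read(Rails.root.join('Modelfile'))

      @client.create(
        { name: 'mickey', modelfile: modelfile_content }
      ) do |event, raw|
        puts event
      end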

First, we create the example route:

#routes.rb
get '/model/create', to: 'ollamaai#model_create'

Then, we add a method to the controller:

    def model_create
      @result = @client.create(
        { name: 'mickey',
          modelfile: "FROM llama2\nSYSTEM You are Mickey Mouse from Disney." }
      ) do |event, raw|
        puts event
      end
      render template: "ollamaai/index"
    end

The code for the view can be the same as in ollamaai/index.html.erb.

To use the model we just created, we can call the generate method again, but we must pass our newly created model in the model attribute, like this:

    def model_use
      @result = @client.generate(
        { model: params[:name],
          prompt: 'Hi! Who are you?' }
      ) do |event, raw|
        print event['response']
      end
      render template: "ollamaai/index"
    end

But we also need to create a route for this generate call:

#routes.rb
get '/model/use/:name', to: 'ollamaai#model_use'

The code for the view can be the same as in ollamaai/index.html.erb. With the route above, visiting /model/use/mickey would use the model we created earlier.

Model show

Ollama also provides a show method. It returns information about a model, including details, modelfile, template, parameters, license, and system prompt.

First, we create the route for the example:

#routes.rb
get '/model/show', to: 'ollamaai#model_show'

Then, we add the controller method:

    def model_show
      @result = @client.show(
        { name: 'llama2' }
      )
      render template: "ollamaai/index"
    end

The code for the view can be the same as in ollamaai/index.html.erb.

There are other model methods, including the ones we mentioned at the beginning of this section. However, we will not cover all of them in this article.
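As a taste of those, listing the locally available models and deleting one could look roughly like this (a minimal sketch based on the method names above; check the ollama-ai documentation for the exact parameters):

    def model_list
      # Lists the models currently available on the local Ollama instance.
      @result = @client.tags
      render template: "ollamaai/index"
    end

    def model_delete
      # Removes a local model by name (here, the mickey model created earlier).
      @result = @client.delete({ name: 'mickey' })
      render template: "ollamaai/index"
    end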

Image processing

Besides text generation and conversation, the generate and chat methods shown in the previous examples can also be used to process images. You need to use a model that supports images, like LLaVA or BakLLaVA, and the image needs to be encoded in Base64.

First, we need to pull the model that we are going to use. We will get it with Ollama directly by executing the following command in the terminal:

ollama pull llava

We must also add the image we want to use in the example to the project’s public folder. Then, we define a route:

#routes.rb
get '/image', to: 'ollamaai#image'

We should also increase the client timeout, because image processing takes longer than text processing. Since the client is built in create_client, we can update it there:

    def create_client
      @client = Ollama.new(
        credentials: { address: 'http://localhost:11434' },
        options: {
          server_sent_events: true,
          connection: { request: { timeout: 120, read_timeout: 120 } } }
      )
    end

For images, we need to add require 'base64' at the top of the controller, and then we create a new method:

    def image
      @result = @client.generate(
        { model: 'llava',
          prompt: 'Please describe this image.',
          images: [Base64.strict_encode64(File.read('public/piano.jpg'))] }
      ) do |event, raw|
        print event['response']
      end
      render template: "ollamaai/index"
    end

The output should be something like:

“The image is a black and white photo of an old piano, which appears to be in need of maintenance. A chair is situated right next to the piano. Apart from that, there are no other objects or people visible in the scene.”

This article’s examples and code snippets can be found in the following GitHub repository. All the available methods, with their corresponding options and parameters, can be found in the ollama-ai API documentation.

Similar tools

There are tools that provide Ollama-like features; some of them are:

– Ava PLS

– local.ai

– lmstudio.ai

These tools also allow you to execute LLMs locally, but they differ in their features and usage.

Implementation and usage considerations

Community. Ollama and the ollama-ai gem have large and active communities of users who support them. This translates into more community integrations, frequent updates, improvements, and general support, so you can expect both the tool and the gem to improve quickly over time. This is important to keep in mind when choosing which model and tooling you are going to use for your projects.

Nano Bots. The ollama-ai gem was created to give people low-level access to Ollama, which can be used to build more complex structures. If you are looking for more user-friendly tools or higher-level abstractions, you may want to explore Nano Bots.

The Nano Bots gem is an implementation of the Nano Bots specification; it supports providers such as Cohere Command, Google Gemini, Maritaca AI MariTalk, Mistral AI, Ollama, and OpenAI ChatGPT, among others. Other features worth mentioning are:

– Cartridges, which extend Nano Bots features and functionality; when creating them, you need to specify a provider and a model that supports them.

– Security and privacy features, such as cryptography, end-user IDs, and decryption.

– Model execution using Docker.

Performance. When running LLMs locally, they may be slower than the cloud versions we are used to; this is normal because they require a large amount of resources. If local execution is too slow for your needs, many of the previously mentioned models have a cloud version available.

Large language models have many different uses, and they keep improving at a very fast pace. New features, better responses, more integrations, and easier usage are some of the areas where we notice the most improvement. If you are building an app on top of a particular LLM, we encourage you to try as many options as possible, because one may work better for your use cases. LLMs will probably keep surprising us, much as the rapid adoption of ChatGPT did.