Python vs. Ruby: Performance and Other Factors That Matter
In this blog post, we'll go through two server-side scripting languages, Python and Ruby, with a focus on comparing their performance and other factors that might help you decide which language to pick for your web application.
Let's begin with performance.
What does performance mean?
In the context of this post, you can think of a high-performance language as one that:
- Provides fast code execution in general
- Handles concurrent tasks efficiently
- Has low utilization of computing resources (typically CPU utilization and memory footprint)
And a high-performance web framework as one that:
- Has short response times
- Provides high throughput (typically responses per minute)
- Provides fast and efficient serialization and deserialization
- Has high availability and fault tolerance
- Scales better with more resources when the load increases
Comparing performance of Python with Ruby
We're aware that a "real" comparison (a.k.a. benchmarking) would require a lot of standardisation of the execution environment. I'll be running the code snippets in this post on my i5 machine with 4 cores and 8 GB of RAM, taking measures to reduce external influence as much as possible. Let's start by evaluating the execution times of both languages for simple iterative and recursive programs.
Comparing run times of simple iterative and recursive programs in Python and Ruby
We'll take two well-known mathematical problem statements:
- Compute the nth value in the Fibonacci sequence.
- Compute the factorial of n.
Here are our simple implementations for the same,
Note: It can be argued that these programs are not equivalent in terms of implementation in their respective languages, and that faster versions can be written. Writing those is beyond the scope of this blog post (for reference, here's what an equivalent implementation of pidigits in Python and Ruby would look like).
The point of this green-apples-to-red-apples comparison is to see in practice whether there's a noteworthy difference in execution times among the "typical" implementations of these programs in the respective languages. This is going to be the theme of the entire post.
# Python version 3.6.9 (CPython implementation)
import timeit

def fib(n):
    # Iterative fibonacci
    a, b = 0, 1
    for i in range(0, n):
        a, b = b, a + b
    return a

def fib_r(n):
    # Recursive fibonacci
    if n < 2:
        return n
    return fib_r(n - 1) + fib_r(n - 2)

def fac(n):
    # Iterative factorial
    x = 1
    for i in range(2, n + 1):
        x = x * i
    return x

def fac_r(n):
    # Recursive factorial
    if n >= 1:
        return n * fac_r(n - 1)
    return 1

# Printing out the run times; the value of n is chosen based on execution times and maximum stack depth
print(timeit.timeit(lambda: fib(1000000), number=1))
print(timeit.timeit(lambda: fib_r(40), number=1))
print(timeit.timeit(lambda: fac_r(900), number=1))
print(timeit.timeit(lambda: fac(100000), number=1))
# Ruby version 2.6.5 (CRuby implementation)
require 'benchmark'

def fib(n)
  # Iterative fibonacci
  a, b = 0, 1
  for i in 0...n
    a, b = b, a + b
  end
  a
end

def fib_r(n)
  # Recursive fibonacci
  return n if n < 2
  return fib_r(n - 1) + fib_r(n - 2)
end

def fac(n)
  # Iterative factorial
  x = 1
  for i in 2..n
    x = x * i
  end
  x
end

def fac_r(n)
  # Recursive factorial
  if n >= 1
    return n * fac_r(n - 1)
  end
  return 1
end

# Printing out the run times; the value of n is chosen based on execution times and maximum stack depth
puts Benchmark.measure { fib(1000000) }
puts Benchmark.measure { fib_r(40) }
puts Benchmark.measure { fac(100000) }
puts Benchmark.measure { fac_r(900) }
Following are the average execution times after running these scripts at 7 different points in time. I tried to make sure no other process was running to reduce bias. The value of n is adjusted so that the program doesn't take too long and doesn't hit a "maximum recursion depth exceeded" error (which happened with Python). Here are the results:
Method | n | Ruby | Python |
---|---|---|---|
fib | 1000000 | 27.935831 s | 10.435885478975251 s |
fib_r | 40 | 9.442680 s | 36.948102285154164 s |
fac | 100000 | 6.833936 s | 2.502855138000001 s |
fac_r | 900 | 2.701 ms | 0.643335000006573 ms |
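The averaging step described above can be sketched with the standard library. This is a minimal illustration, not the exact procedure used for the table: the helper name `average_runtime` is hypothetical, the runs here happen back to back rather than at different points in time, and a small n is used so the example finishes quickly.

```python
import statistics
import timeit


def fib(n):
    # Iterative Fibonacci, same shape as the benchmarked function
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def average_runtime(func, runs=7):
    # Call func once per run; return the mean and the spread in seconds
    samples = timeit.repeat(func, number=1, repeat=runs)
    return statistics.mean(samples), statistics.stdev(samples)


# A small n keeps this quick; the table above uses much larger values
mean_s, stdev_s = average_runtime(lambda: fib(10_000))
print(f"fib(10000): mean {mean_s:.6f} s, stdev {stdev_s:.6f} s over 7 runs")
```

Reporting the standard deviation next to the mean gives a rough sense of how much background noise affected the runs.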
And here are some of our observations,
- Python is roughly 2.5x faster than Ruby for computations in a typical for loop. A plausible contributor to Ruby's slowness here is per-iteration overhead: Ruby's for loop is implemented on top of each, so every iteration invokes a block.
- For recursion, the picture is mixed: Ruby is about 4x faster for fib_r, while Python is about 4x faster for fac_r. It is commonly known that function call overhead is expensive in Python. Neither default implementation optimizes away the deep call stacks that naive recursive implementations like ours create: CPython caps the recursion depth (1000 frames by default), and CRuby ships with tail-call optimization turned off.
If you're interested in benchmarks of these languages on programs like fannkuch-redux, fasta, k-nucleotide, mandelbrot, n-body, etc., the Benchmarks Game's Ruby vs Python 3 comparison is highly recommended (similar source).
Moving on, let's see how these languages perform when it comes to reading files from disk and parsing common formats like JSON.
Comparing run times for file reading from disk and JSON parsing programs in Python and Ruby
For the data, I've taken one of my scraping data dumps in JSON, which is 5.5 MB in size. Its schema looks something like this:
{
"source": "https://www.startupranking.com",
"data": [
{
"country": "United States",
"startups": [
{
"sr_rank": "93,324",
"name": "Airbnb",
"overall_rank": "1",
"url": "https://www.startupranking.com/airbnb",
"country_rank": "1",
"pitch": "Vacation Rentals, Homes, Experiences & Places\n - Airbnb is a trusted online marketplace for people ...",
"fundingRounds": [
{
"date": "Jun 28, 2015",
"amount": "$ 1,500,000,000",
"name": "Series E",
"investors": [
"General Atlantic",
"Hillhouse Capital",
"Tiger Global",
"Baillie Gifford",
"China Broadband Capital",
"Fidelity Investments",
"Ggv Capital",
"Horizon Ventures",
"Kleiner Perkins Caufield Byers",
"Sequoia Capital",
"T Rowe Price",
"Temasek",
"Wellington Management",
"Groupe Arnault",
"Horizons Ventures"
]
},
{
"date": "Apr 16, 2014",
"amount": "$ 475,000,000",
"name": "Series D",
"investors": [
"Dragoneer Investment Group",
"Sequoia Capital",
"Sherpa Ventures",
"T Rowe Price",
"Tpg Growth",
"Andreessen Horowitz"
]
},
{
"date": "Oct 28, 2013",
"amount": "$ 200,000,000",
"name": "Series C",
"investors": [
"Founders Fund",
"Ashton Kutcher",
"Crunchfund",
"Sequoia Capital",
"Airbnb"
]
},
...
...
]}]}]}
It's nested enough and contains maps as well as arrays. The next task is to create methods to read this file and then parse it as JSON.
# Python version 3.6.9 (CPython implementation)
# Importing the built-in json module for parsing and timeit for timing
import json
import timeit

def read_file(path):
    # Reading file contents from path
    with open(path, 'r') as f:
        content = f.read()
    return content

def load_json(path):
    # Parsing JSON stored in the file
    return json.loads(read_file(path))

# print(timeit.timeit(lambda: read_file('data.json'), number=1))
print(timeit.timeit(lambda: load_json('data.json'), number=1))
# Ruby version 2.6.5 (CRuby implementation)
# Importing the built-in json module for parsing and benchmark for timing
require "json"
require "benchmark"

def read_file(path)
  # Reading file from `path`
  return File.read(path)
end

def load_json(path)
  # Parsing JSON from file
  JSON.parse(read_file(path))
end

# puts Benchmark.measure { read_file("data.json") }
puts Benchmark.measure { load_json("data.json") }
Nothing fancy, just using in-built ways to read a file and parse JSON from string, and recording their execution times one-by-one. Here are the results,
Method | Ruby | Python |
---|---|---|
read_file | 4.676 ms | 6.013702999553061 ms |
load_json | 96.573 ms | 48.90625600000931 ms |
Observations
- Reading from a file is slightly faster in Ruby than in Python.
- However, parsing JSON using standard library methods takes almost twice as long in Ruby as in Python.
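Since load_json includes the file read, it can help to time the two steps separately. The following is a rough sketch under stated assumptions: the `timed` helper and the synthetic `sample` document are hypothetical stand-ins (the 5.5 MB dump used in this post is not reproduced), so the absolute numbers will not match the table.

```python
import json
import os
import tempfile
import time


def timed(func, *args):
    # Return (result, elapsed milliseconds) for a single call
    start = time.perf_counter()
    result = func(*args)
    return result, (time.perf_counter() - start) * 1000


def read_file(path):
    # Read the whole file into memory as text
    with open(path, "r") as f:
        return f.read()


# Hypothetical stand-in data with a loosely similar nested shape
sample = {"source": "example", "data": [
    {"country": "X", "startups": [{"name": f"s{i}", "investors": ["a", "b"]}
                                  for i in range(1000)]}
]}

# Write the sample to a temporary file so the read step touches the disk
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
    path = tmp.name

try:
    text, read_ms = timed(read_file, path)
    parsed, parse_ms = timed(json.loads, text)
finally:
    os.remove(path)

print(f"read: {read_ms:.3f} ms, parse: {parse_ms:.3f} ms")
```

Timing json.loads on an in-memory string isolates the parsing cost from the disk read, which makes it easier to see which step dominates.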
Concurrency in Python and Ruby
Coming to concurrency, the popular implementations of both languages (CPython and CRuby) have a global interpreter lock (called the GIL in CPython and the GVL in CRuby), which means:
- Only one thread can execute interpreter code at a time, even if you have a multi-core processor.
- In essence, you can create multiple threads, but they take turns instead of running in parallel (concurrency without parallelism).
- Parallel I/O is still possible (and happens) among multiple threads, because the lock is released while a thread waits on I/O.
- To achieve parallelism for CPU-bound processing, the program needs to spawn separate processes and coordinate with them.
Python provides some abstraction for multiprocessing through the built-in multiprocessing module, and Ruby provides the Process module, which is closer to the OS level. For concurrent I/O, Python has shipped the asyncio module since version 3.4, and the module received significant usability and performance improvements in Python 3.7. Popular third-party options in Ruby are the async framework and the concurrent-ruby toolkit. There is also a proposed pull request in Ruby for a fiber-based selector that would enhance concurrency.
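To make the "parallel I/O is still possible" point concrete, here is a small sketch. It assumes that time.sleep is a fair stand-in for a blocking network or disk call, and the function names (`fake_io`, `run_sequential`, `run_threaded`) are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    # time.sleep stands in for a blocking network or disk call;
    # CPython releases the GIL while a thread is blocked like this
    time.sleep(delay)
    return task_id


def run_sequential(n):
    # One task after another on the main thread
    return [fake_io(i) for i in range(n)]


def run_threaded(n):
    # executor.map preserves input order in its results
    with ThreadPoolExecutor(max_workers=n) as executor:
        return list(executor.map(fake_io, range(n)))


for runner in (run_sequential, run_threaded):
    start = time.perf_counter()
    results = runner(4)
    elapsed = time.perf_counter() - start
    print(f"{runner.__name__}: {results} in {elapsed:.2f} s")
```

On a typical machine the threaded run should finish in roughly the time of a single task, because the waits overlap. Replacing the sleep with CPU-bound work would remove most of that benefit, which is where separate processes come in.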
Comparing performances of web frameworks in Python and Ruby
I'm going to take popular minimalistic web frameworks Flask and Sinatra in the respective languages, and compare their response times for the following functions through REST APIs
- Simple GET request
- Simple POST request
- Rendering a JSON response from an already initialized variable
- Instantiating a new object and then rendering a JSON response
- Rendering an HTML response via templating
Here's the code for all this:
# Filename: app.py
# Flask version: 1.1.1
from flask import Flask, jsonify, render_template

app = Flask(__name__)

# Already initialized list of languages
languages = [
    {
        "name": "Python",
        "is_interpreted": True,
        "version": "3.6.9"
    },
    {
        "name": "Ruby",
        "is_interpreted": True,
        "version": "2.6.5"
    },
]

class Language:
    # Our language class
    def __init__(self, name, is_interpreted, version):
        self.name = name
        self.is_interpreted = is_interpreted
        self.version = version

# Simple GET request
@app.route("/simple-get")
def get():
    return "Hello Scout!"

# Simple POST request
@app.route("/simple-post", methods=["POST"])
def post():
    return "Hello Scout!"

# Rendering a JSON response
@app.route("/simple-json", methods=["GET"])
def render_json():
    return jsonify(languages)

# Instantiating an object and then rendering a JSON response
@app.route("/simple-json-2", methods=["GET"])
def render_json_custom_object():
    lang = Language(**{
        "name": "Python",
        "is_interpreted": True,
        "version": "3.6.9"
    })
    return jsonify(lang.__dict__)

# Rendering an HTML response via templating
@app.route("/render-html", methods=["GET"])
def render_html():
    return render_template('template_python.html', languages=languages)

if __name__ == "__main__":
    app.run(debug=False)
<!-- Filename templates/template_python.html -->
<html>
<head>
<title>Languages comparison</title>
</head>
<body>
{% for language in languages %}
<h1> {{ language["name"] }} </h1>
<ul>
<li>Is interpreted: {{ language["is_interpreted"] }}</li>
<li>Current version: {{ language["version"] }}</li>
</ul>
{% endfor %}
</body>
</html>
# Filename: app.rb
# Sinatra version 2.0.7
require 'sinatra'
require 'json'

# Already initialized array of languages
languages = [
  { :name => 'Ruby', :is_interpreted => true, :version => '2.6.5' },
  { :name => 'Python', :is_interpreted => true, :version => '3.6.9' },
]

class Language
  # Our language class
  attr_accessor :name, :is_interpreted, :version

  def initialize(name, is_interpreted, version)
    @name = name
    @is_interpreted = is_interpreted
    @version = version
  end

  def as_json(options={})
    {
      name: @name,
      is_interpreted: @is_interpreted,
      version: @version
    }
  end

  def to_json(*options)
    as_json(*options).to_json(*options)
  end
end

# Simple GET request
get '/simple-get' do
  "Hello Scout!"
end

# Simple POST request
post '/simple-post' do
  "Hello Scout!"
end

# Rendering a JSON response
get '/simple-json' do
  content_type :json
  languages.to_json
end

# Instantiating an object and then rendering a JSON response
get '/simple-json-2' do
  language = Language.new("Ruby", true, "2.6.5")
  content_type :json
  language.to_json
end

# Rendering an HTML response via templating
get '/render-html' do
  erb :template_ruby, :locals => {:languages => languages}
end
<!-- Filename views/template_ruby.erb -->
<html>
<head>
<title>Languages comparison</title>
</head>
<body>
<% languages.each do |language| %>
<h1><%= language[:name] %></h1>
<ul>
<li>Is interpreted: <%= language[:is_interpreted] %></li>
<li>Current version: <%= language[:version] %></li>
</ul>
<% end %>
</body>
</html>
I used Postman to record the response times; you can also use cURL to do the same. Here are the observations:
Endpoint | Flask response time (in ms) | Sinatra response time (in ms) |
---|---|---|
/simple-get | | |
/simple-post | | |
/simple-json | | |
/simple-json-2 | | |
/render-html | | |
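If you would rather script the measurement than click through Postman, something along these lines could work. It is only a sketch: the helper name `response_time_ms` is hypothetical, it uses nothing beyond the Python standard library, and each sample includes connection setup, so the numbers are not directly comparable with Postman's breakdown.

```python
import statistics
import time
import urllib.request


def response_time_ms(url, method="GET", runs=5):
    # Median wall-clock time in milliseconds over several requests
    samples = []
    for _ in range(runs):
        # An empty body is sent for POST requests
        data = b"" if method == "POST" else None
        request = urllib.request.Request(url, data=data, method=method)
        start = time.perf_counter()
        with urllib.request.urlopen(request) as response:
            response.read()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

With both apps running locally on their default ports (5000 for Flask's development server, 4567 for Sinatra), a call such as response_time_ms("http://127.0.0.1:5000/simple-get") would return the median time for that endpoint. Using the median rather than the mean makes a single slow first request less influential.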
You can also add Scout to your application to monitor the response times. Here's how you'd set it up for Flask:
from flask import Flask
from scout_apm.flask import ScoutApm
# Setup a flask 'app' as normal
app = Flask(__name__)
# Attach ScoutApm to the Flask App
ScoutApm(app)
# Scout settings
app.config["SCOUT_MONITOR"] = True
app.config["SCOUT_KEY"] = "YOUR_SCOUT_API_KEY"
app.config["SCOUT_NAME"] = "flask_endpoints"
And here's how you'd do it for Sinatra:
require 'sinatra'
require 'scout_apm'
ScoutApm::Rack.install!
run Sinatra::Application
get '/simple-get' do
# Letting Scout know to track this specific request as a Rack transaction
ScoutApm::Rack.transaction("get /simple-get", request.env) do
"Hello Scout!"
end
end
Here are some observations,
- Most of the response times are equivalent; templating is slightly faster in Flask. The same goes for rendering a JSON response from a class object.
- Sinatra saved some time on the "DNS Lookup" and "TCP Handshake" parts of the final time by caching.
I wanted to compare network and database performance as well, but dropped the idea because of language-specific differences in the implementation of the database drivers and network libraries. Anyway, just as with the language comparisons, if you're interested in more technical benchmarking of Python and Ruby frameworks, I'd recommend checking out this link.
In a real-world scenario, web framework speed might be just one part of the bigger story. A request cycle might consist of the following critical components, in sequential order (from the moment the client triggers a request):
- Load balancers like HAProxy
- Web accelerators like Varnish and Squid
- Web servers like nginx (nginx by the way can also take up the job of accelerator and load balancer)
- Application servers like Unicorn and Gunicorn
- The frameworks like Ruby On Rails and Django
- Caches at the application level like Redis and Memcached
- Finally the I/O in the form of disk, databases, network, etc.
The frameworks and technologies used at each of these steps also contribute to the final response time, so choosing the right design is critical here.
Differences beyond Performance
So far, our analysis shows that some things are slower in one language and some in the other. A lot of these differences stem from the design philosophy of the languages and how they evolved over time. Also, there can be reasons other than performance to pick between Python and Ruby. Let's go through them before we conclude the post.
Design philosophies of Python, Ruby and their frameworks
Ruby is designed to be a friendly language that keeps the programmer's comfort in mind. The core principle in Ruby is "the principle of least surprise". As a result, Ruby has a lot of high-level functionality to make programming enjoyable; some programmers also call Ruby and frameworks like Ruby on Rails "magical" in that sense.
On the other hand, the core philosophy of Python is aptly summarized in the Zen of Python:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
The major theme is being explicit and encouraging one particular way of doing things in Python. This is in slight contrast with Ruby, where a lot of things happen implicitly and there are multiple ways to do the same thing. You can notice some of these subtle differences in the way these languages deal with:
- Unicode strings and byte strings (Ruby is more implicit about the encodings)
- Switch statements (Python has only if-else and no switch construct)
- Anonymous functions (Python has only one way, lambdas, while Ruby has blocks, Procs, and lambdas)
- Getters and setters (Python has descriptor syntax to access instance variables, whereas in Ruby you can specify attr_reader and attr_writer accessors or you can write explicit getter-setter methods)
- for loops (Python has the typical for x in y way, whereas Ruby has multiple ways like n.times do, collection.each do |item|, along with for x in y)
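The Python side of the list above can be sketched in a few lines. The class and function names here are made up for illustration, and a property is used as one common way to expose a read-only attribute (properties are built on descriptors).

```python
class Language:
    def __init__(self, name, version):
        self._name = name
        self.version = version

    @property
    def name(self):
        # Read-only access, roughly what attr_reader :name gives in Ruby
        return self._name


def describe(lang):
    # No switch construct in the Python version used in this post:
    # branching is done with if/elif/else
    if lang.name == "Python":
        return "explicit"
    elif lang.name == "Ruby":
        return "implicit"
    else:
        return "unknown"


# The single anonymous-function form: a one-expression lambda
major = lambda lang: int(lang.version.split(".")[0])

languages = [Language("Python", "3.6.9"), Language("Ruby", "2.6.5")]

# The one looping idiom: for x in y
for lang in languages:
    print(lang.name, describe(lang), major(lang))
```

Assigning to lang.name raises an AttributeError because the property defines no setter, which is the explicit counterpart to leaving out attr_writer in Ruby.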
Talking about the most popular frameworks in these languages (Rails and Django), Rails is an integral skill for most Ruby programmers, and some people go as far as arguing that Rails is what has kept the language alive. Unlike Ruby as a programming language, Rails is designed to be strongly opinionated, favoring convention over configuration, which is why it is considered good for fast prototyping and quick iterations (Rails will do all the heavy lifting if you do things the Rails way). Django, on the other hand, is more explicit. It requires the programmer to configure different aspects of the application and thus involves a slight learning curve. You can see similar differences in other framework comparisons like Sinatra and Flask. Both approaches have their pros and cons, and one may outweigh the other depending on your use case.
Community
Python's community has been growing rapidly due to its suitability in domains beyond web applications (like data analytics, image processing, deep learning, etc.). This increasing interest has given the language a great boost in the past few years in terms of features, performance, and supporting packages. The most active community for Ruby is the Rails community, so Rails as a framework is still growing decently.
Dependency Management
Python's dependency management ecosystem is slightly more mature and developer-friendly than Ruby's. I find myself in dependency hell in Ruby more often than in Python, mostly because managing isolated environments in Ruby is trickier than using Python's virtual environments. The other aspect is that PyPI (Python's package index) is more versatile when it comes to finding actively maintained, reusable libraries and avoiding reinventing the wheel. At the time of writing this post, there are 191,743 Python packages on PyPI and 155,401 gems hosted at RubyGems.
Testing and debugging
In my personal experience, debugging in Ruby has been slightly more difficult. However, it's still friendlier than most other languages. For testing, RSpec is widely used for behavior-driven development (BDD). In Python, the popular BDD framework is behave, followed by pytest plugins like pytest-bdd. Developers find RSpec to be more mature than the Python alternatives.
Python and Ruby's current usage in real-world Web development
Both of these languages are used in the tech stacks of large-scale websites. Some examples:
Some popular websites that use Ruby
Some popular websites that use Python
Conclusion
In this post, we tried to evaluate the performance of Python, Ruby, and their frameworks for simple but commonly performed tasks. There are certain cases where one language shines over the other, but performance alone doesn't seem like a good reason to pick one of these languages over the other because:
- Developers matter more: The per hour CPU costs in the cloud are cheaper than per hour developer time.
- In most business cases, solving the problem first (getting to product-market fit) is more important than focusing on performance.
- For large scale web applications, performance is more of a design-architecture game than of picking one language among the two.
- If language-performance is really what you want, then there are other low-level languages (probably the compiled ones) which can do much better.
Neither of these two languages can objectively be called better than the other. Recent Stack Overflow developer survey results are slightly more favorable to Python and its frameworks, but both languages are happily used and even supported by large-scale companies. Better reasons for choosing between the two can be:
- The community support for your use-case
- The developer team's familiarity and preference
- Necessary third-party support in terms of reusable packages and their ease of use (documentation)
- The level of control that you need (configuration vs convention)
- The speed at which you want to develop your application
Hope this article helped you inch closer to your decision between these two languages. And no matter which language you end up choosing, Scout has got your application monitoring needs covered!