Python vs. Java: Comparing Two Popular Programming Languages
In this article, we'll compare the features of two server-side programming languages: Python and Java. Let's begin with some design differences between the two languages.
Fundamental differences in the design and implementation of Python and Java
History
Having an idea of the past can provide us the context to understand why things were built the way they are now. Python was created to bridge the gap between C and the shell. It was intended to be a higher-level interpreted language that enables clean, concise, and readable code.
Java was created to be a compiled, platform-independent, object-oriented programming language. The intention was to achieve code portability ("write once, run anywhere") with little or no programmer effort. One of the early applications of Java was incorporation into browsers like Netscape, and it soon became popular.
You'll find more legacy systems and enterprise-level web applications programmed in Java than in any other language. And you'll often find Python being used as a "glue" to combine different components in these systems.
Design
There are a lot of great resources on the internet that explain the design differences in depth, so we won't dive into those. Instead, I'll mention a few "simplified" takeaways:
- The most fundamental design difference is that Python is an interpreted language, and Java is a compiled language. This difference dictates a lot of the features and limitations of both languages.
- Any programming language must translate the code written by the programmer into a set of instructions, or machine code, that can be executed on the machine. In interpreted languages, this process happens on-the-fly while executing the program, whereas compiled languages do some pre-processing before executing the program.
- The Java compiler converts the code into platform-independent bytecode, which can then be loaded and executed on any instance of the Java Virtual Machine (JVM). Similarly, Python code is processed into Python bytecode and runs in the Python virtual machine.
- However, the difference is that while Python compiles to bytecode at runtime, Java compiles in advance. Java runtime also consists of a Just-in-time (JIT) compiler, which improves efficiency by being able to compile the bytecode into machine code in "almost real-time."
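To make the bytecode point concrete, here is a small sketch that uses CPython's standard `dis` module to look at the bytecode generated for a function. The function `add` is a made-up example, and the exact instruction names differ between CPython versions, so treat the output as illustrative only.

```python
import dis


def add(a, b):
    # A hypothetical function, used only to have something to inspect
    return a + b


# By the time the function object exists, CPython has already compiled
# its body to bytecode, stored as raw bytes on the code object
raw_bytecode = add.__code__.co_code

# dis decodes the raw bytes into readable instruction names
instruction_names = [ins.opname for ins in dis.get_instructions(add)]

print(type(raw_bytecode).__name__, len(raw_bytecode))
print(instruction_names)
```

Running `dis.dis(add)` instead prints a full human-readable listing, which is handy when exploring interactively.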
Consequences of the design and history
On semantics
- Python is a dynamically-typed language, meaning types belong to values and are checked at run time, so variables need no type declarations. Java, on the other hand, is a statically-typed language, which means variable types are declared explicitly and checked at compile time.
- In Python, you can worry less about variable types and focus more on the logic. So if you write Pythonic code, you can do more in fewer lines of code as compared to Java. And on top of that, the indentation rules make the code inherently more readable.
- Java is strict in the sense that programmers need to write more verbose code. In return, many mistakes can be caught at compile time in Java. You also have more flexibility and control in terms of adhering to various design patterns in Java as compared to Python.
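The typing difference can be illustrated with a short sketch. The function `add_one` and the values are hypothetical; the point is only that Python accepts the code and rejects the bad call when it executes, whereas a Java method declared to take an `int` would reject a `String` argument at compile time.

```python
def add_one(value):
    # No type is declared for `value`; the check happens when this line runs
    return value + 1


x = 42                # x refers to an int
int_result = add_one(x)

x = "forty-two"       # the same name can later refer to a str
try:
    add_one(x)
    failed_at_runtime = False
except TypeError:
    # str + int is rejected only when the call is actually executed
    failed_at_runtime = True

print(int_result, failed_at_runtime)
```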
On performance
Many luxuries in the CPython implementation of Python (the most widely used Python implementation) come at the cost of:
- A slower run time, because more work is needed to translate Python code to machine-level code. Java does many things, like type checking and locating memory addresses for different identifiers, during pre-processing (generation of bytecode), and static typing provides opportunities for optimization during run time as well.
- More chances of getting errors (related to type checking and conversions) during run time.
- A higher memory footprint of objects in Python.
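As a rough illustration of the memory point, the sketch below asks CPython how many bytes a small integer object and a list occupy. The exact numbers depend on the interpreter version and platform, so none are quoted here; for comparison, a Java primitive `long` always takes 8 bytes.

```python
import sys

small_int = 1
# On CPython, even a small integer is a full object with a header
int_size = sys.getsizeof(small_int)

# A list stores references; the elements themselves are separate objects
numbers = list(range(1000))
list_size = sys.getsizeof(numbers)

print(int_size, list_size)
```

Note that `sys.getsizeof` reports only the list's own storage, not the integer objects it refers to, so the real footprint of the data is larger still.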
Concurrency in Python
CPython implements a Global Interpreter Lock (GIL) to ensure thread-safety, which means:
- Only one thread can execute Python bytecode at a time, even if you have a multi-core processor.
- In essence, you can create multiple threads, but they run turn-by-turn instead of running in parallel (concurrency without parallelism). Parallel I/O is still possible (and happens) among multiple threads.
- To achieve parallelism with processing, you need the program to spawn separate processes and coordinate with them. These processes can be instances of interpreters executing Python code or low-level programs like C-extensions.
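The claim that I/O can still overlap across threads can be checked with a small sketch. Here `time.sleep` stands in for a blocking I/O call (it releases the GIL while waiting), and `fake_io` and `run_in_threads` are hypothetical helper names chosen for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(seconds):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting
    time.sleep(seconds)
    return seconds


def run_in_threads(delays):
    # Run one simulated I/O call per thread and measure the total wall time
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io, delays))
    return results, time.perf_counter() - start


results, elapsed = run_in_threads([0.2, 0.2, 0.2, 0.2])
print(results, round(elapsed, 2))
```

If the four waits ran one after another, the total would be about 0.8 seconds; with threads, the elapsed time should be much closer to a single 0.2 second wait. A CPU-bound function would not see the same benefit under the GIL.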
Python provides some abstraction for multiprocessing through the built-in multiprocessing module. For concurrent I/O-bound tasks, Python includes the asyncio module, which received significant usability and performance improvements in the recent Python 3.7.x version.
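For a feel of what asyncio code looks like, here is a minimal sketch. The coroutine `fetch` is hypothetical and uses `asyncio.sleep` in place of a real network call; `asyncio.run` requires Python 3.7 or later.

```python
import asyncio


async def fetch(name, delay):
    # asyncio.sleep stands in for a non-blocking network call
    await asyncio.sleep(delay)
    return name


async def main():
    # gather runs the coroutines concurrently on a single thread
    # and returns their results in the order they were passed in
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1), fetch("c", 0.1))


names = asyncio.run(main())
print(names)
```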
Concurrency in Java
The Java Virtual Machine (JVM) is capable of executing multiple threads in parallel on multiple CPU cores. Programmers have to deal with the complexities of dividing their tasks into threads and synchronizing between them. Java provides the Thread class and the java.util.concurrent package, which contains some abstractions for multi-threading. The fact that popular distributed computation frameworks such as Hadoop (written in Java) and Spark (written mostly in Scala, which also runs on the JVM) are built on the JVM is evidence of its suitability for concurrent execution.
Note: We discussed the most popular implementation of Python (CPython) in this section. There are other implementations as well, which make different trade-offs for the sake of performance and to support parallel execution (take a look at the PyPy project, which supports JIT compilation, and Stackless Python, which supports lightweight concurrency through microthreads).
Comparing simple iterative and recursive programs in Python and Java
We'll take two well-known mathematical problem statements:
- Compute the nth value in the Fibonacci sequence.
- Compute the factorial of n.
Below are simple implementations of both. In the code and the results, you can observe some of the differences that we discussed in the section above.
# Python version 3.8.0 (CPython implementation)
import timeit

def fib(n):
    # Iterative fibonacci
    a, b = 0, 1
    for i in range(0, n):
        a, b = b, a + b
    return a

def fib_r(n):
    # Recursive fibonacci
    if n < 2:
        return n
    return fib_r(n - 1) + fib_r(n - 2)

def fac(n):
    # Iterative factorial
    x = 1
    for i in range(2, n + 1):
        x = x * i
    return x

def fac_r(n):
    # Recursive factorial
    if n >= 1:
        return n * fac_r(n - 1)
    return 1

# Printing out the run times, the value of n is decided based on execution times and maximum stack depth
# fib_r is reported in seconds; the other three are converted to milliseconds
print(timeit.timeit(lambda: fib(60), number=1) * 1000)
print(timeit.timeit(lambda: fib_r(40), number=1))
print(timeit.timeit(lambda: fac_r(25), number=1) * 1000)
print(timeit.timeit(lambda: fac(25), number=1) * 1000)
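A single run with `number=1` is sensitive to noise from the operating system and the interpreter. As a hedged alternative, the sketch below uses `timeit.repeat` to run several batches and reports the fastest one; the batch size of 10,000 and the repeat count of 5 are arbitrary choices for illustration.

```python
import timeit


def fib(n):
    # Iterative Fibonacci, same logic as in the listing above
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Run 5 batches of 10,000 calls each and keep the fastest batch,
# which is the one least disturbed by other activity on the machine
batch_times = timeit.repeat(lambda: fib(60), number=10_000, repeat=5)
best_ms_per_call = min(batch_times) / 10_000 * 1000

print(f"fib(60): {best_ms_per_call:.6f} ms per call")
```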
/*
Java version 11.0.3
Please excuse me for using `snake_case` in the program.
*/
public class SimpleMethodsPrimitive {
    public static void main(String[] args) {
        long start_time = System.nanoTime();
        fib(60);
        // fib_r(40);
        // fac_r(25);
        // fac(25);
        long stop_time = System.nanoTime();
        // Printing out run time in nanoseconds
        System.out.println(stop_time - start_time);
    }

    private static long fib(int n) {
        // Iterative fibonacci
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            // Compute the next value first, so that `a` is not overwritten too early
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    private static int fib_r(int n) {
        // Recursive fibonacci
        return n < 2 ? n : fib_r(n - 1) + fib_r(n - 2);
    }

    private static long fac(int n) {
        // Iterative factorial
        // Note: a long overflows for n > 20, so fac(25) is useful for timing only
        long x = 1;
        for (int i = 2; i < n + 1; i++) {
            x = x * i;
        }
        return x;
    }

    private static long fac_r(int n) {
        // Recursive factorial (same overflow caveat as fac)
        return n < 1 ? 1 : n * fac_r(n - 1);
    }
}
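One difference that the two listings expose is integer handling: Python integers grow as needed, while a Java `long` is fixed at 64 bits and silently wraps around on overflow. The sketch below emulates that wrap-around in Python for 25!. The helper `to_java_long` is a hypothetical name introduced for this illustration and is not part of either program above.

```python
import math


def to_java_long(value):
    # Emulate 64-bit two's-complement wrap-around, as Java's long arithmetic does
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


exact = math.factorial(25)       # Python ints have arbitrary precision
wrapped = to_java_long(exact)    # what a 64-bit long would hold after overflow
java_long_max = (1 << 63) - 1

print(exact)
print(wrapped)
print(exact > java_long_max)
```

For timing purposes the wrapped value does not matter, but a Java program that needs the exact result would have to use something like `java.math.BigInteger`.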
Beyond design - The Development Ecosystem, Libraries and Frameworks
Developer productivity is an essential factor in deciding which language to choose. Let's take a look at the ecosystem and libraries that support developer productivity.
Dependency management and Code distribution
Java code is packaged and distributed in the form of .jar files, whereas Python packages are commonly distributed in the form of .whl (wheel) files. Package management in Java is relatively stable but more complex to learn.
In Python, pip is pretty much what you'll need to know about in most of the use-cases to manage dependencies. PyPI (Python's package index) is the place where the packages are hosted so that anyone can use them.
The closest equivalent of PyPI in Java is the Maven Central repository (browsable through sites such as MVNRepository), and the dependencies are specified in the configuration files of build-automation tools like Apache Maven and Gradle. Python now has built-in support for virtual environments (isolated dependency environments specific to projects); a similar thing can be achieved in Java using classpaths.
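Virtual environments are usually created from the command line, but the same standard-library `venv` module can be driven from Python, as in this sketch. The helper name `create_env` and the `.venv` directory name are assumptions made for the example, and pip is left out only to keep the example fast and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir):
    # Create an isolated environment under parent_dir/.venv (hypothetical name)
    env_dir = Path(parent_dir) / ".venv"
    # with_pip=False keeps the sketch fast and offline; real projects usually want pip
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(tmp)
    # Every virtual environment carries a pyvenv.cfg file at its root
    has_config = (env_dir / "pyvenv.cfg").is_file()

print(has_config)
```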
Libraries and Frameworks supporting typical web development
Java has a strong JDBC (Java DataBase Connectivity) API for connecting to databases, which is also one of the reasons why Java has been a popular choice for enterprise systems. Python's database access layers are slightly more challenging to deal with, as compared to Java. Both languages have ORM capabilities.
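For comparison with JDBC, Python database drivers generally follow the DB-API convention (PEP 249). The sketch below uses the built-in `sqlite3` driver with an in-memory database; the `users` table and its rows are made up for illustration.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Placeholders (?) let the driver handle quoting, much like JDBC's PreparedStatement
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])
conn.commit()

rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
conn.close()

print(rows)
```

Drivers for other databases expose the same connect, execute, and fetch pattern, although details such as the placeholder style can differ between them.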
It is tough to write an entire backend from scratch, so both languages have frameworks that provide abstractions to set up a reliable and secure backend without reinventing the wheel. Spring is by far the most popular web framework in Java, whereas Django and Flask are the two most popular web frameworks in Python.
In terms of performance, Java web frameworks are faster, but the Python frameworks are also not far behind (see the benchmarks here). Spring has a LOT of production-friendly dependencies to deal with caching, authentication, databases, messaging, and whatnot, which means the developers can focus just on business logic. The downside of Spring is its steep learning curve (because of things like dependency injection, verbose configurations, and more), with some developers even describing their early learning experiences as "black magic." It is also much more resource-intensive than Django or Flask, and this overhead can sometimes seem unjustified for small-to-medium size web applications.
Debugging and Testing
Both languages are easy to debug, but I've personally found stack traces and exceptions in Python to be more helpful. Another thing that stands out for me is that the edit-and-run cycle is usually much faster in Python than in Java (since Python has no separate build step), which is excellent when you are doing trial-and-error style debugging. Part of the difference may also be that Java codebases are typically larger and more complex.
If you use a modern IDE or static code analysis tools, these can prevent many errors beforehand in both languages, and they let you add breakpoints to inspect variables during run time.
Java has various popular libraries at various levels of abstraction (like JUnit, TestNG, PowerMock) to unit-test your code. Python has a built-in unittest library whose design was inspired by Java's JUnit framework. Other higher-level frameworks for unit-testing in Python include pytest and nose. Unit testing in Python requires slightly more effort because of its dynamically typed nature, since tests also have to cover type mistakes that the Java compiler would catch.
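Here is a minimal sketch of what a test with the built-in `unittest` module can look like. The function under test and the test names are made up for this example; the second test shows the kind of type check that falls to the test suite in a dynamically typed language.

```python
import unittest


def fac(n):
    # Iterative factorial, the function under test
    x = 1
    for i in range(2, n + 1):
        x *= i
    return x


class FacTest(unittest.TestCase):
    def test_small_values(self):
        self.assertEqual(fac(0), 1)
        self.assertEqual(fac(5), 120)

    def test_rejects_wrong_type(self):
        # Dynamic typing means type misuse has to be covered by tests
        with self.assertRaises(TypeError):
            fac("5")


# Load and run the test case programmatically instead of via the command line
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FacTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```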
When it comes to Behavior Driven Development, the most popular BDD framework in Python is behave, followed by pytest plugins like pytest-bdd. In Java, popular choices are Cucumber and Spock. Selenium, the most popular web-automation testing framework, is primarily written in Java. It is easier to find solutions to your issues when you're using their Java API (Selenium has a Python API, too) to do things like end-to-end automation testing.
Documentation also helps in debugging and testing. Python has a built-in doctest module that fits well with the interactive nature of the language: it lets you write interactive statements in the documentation that serve the purpose of explaining as well as testing (this way there is less chance of the documentation becoming outdated). Similar functionality is very complex to replicate in Java.
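A short sketch of doctest in action follows. The examples in the docstring double as documentation and as tests; the explicit finder and runner objects are used here only to keep the example self-contained, and the `fib` function is the same illustrative one used earlier in this article.

```python
import doctest


def fib(n):
    """Return the n-th Fibonacci number.

    >>> fib(0)
    0
    >>> fib(10)
    55
    >>> [fib(i) for i in range(6)]
    [0, 1, 1, 2, 3, 5]
    """
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Collect the examples embedded in the docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(fib, "fib", module=False, globs={"fib": fib}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results)
```

For code that lives in a regular module file, running `python -m doctest` on that file is the more common way to execute the same checks.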
Community
Python's community has been growing rapidly due to its suitability in domains beyond web applications (like Data Analytics, Image Processing, Machine Learning, and more). According to GitHub's Octoverse, Python was the second most used language on GitHub, followed by Java. In Stack Overflow's 2019 developer survey, Python was crowned the fastest-growing programming language, edging out Java this year.
Who's using Java and Python in web development?
Below is a list of well-known companies that use Java in web development:
And here is a list of companies that use Python in web development:
Conclusion
In this article, we discussed the differences between Java and Python. We can safely say that both of these languages are suitable for server-side web development. If you're about to build a very "enterprisey" web application where performance and security are critical, then Java still has the upper hand despite Python's fast-growing ecosystem. On the other hand, if you have experienced Python developers and care more about developer productivity, or have to deal with things like extensive number crunching, image processing, or analytics, then Python has the edge over Java.