Network Address Translation


NAT is so well-integrated into the structure of today’s internet that it cannot be omitted even now with IPv6, despite initially being intended as a temporary solution. In this NAT series of posts, I will talk about NAT, its problems, and the various solutions and techniques which have been developed around it.

In this post, I will introduce NAT and its problems, specifically its incompatibility with UDP and P2P protocols.

Network Address Translation (NAT) came into the picture in the 1990s, with the unprecedented increase in the number of required IP addresses which could no longer be met by the available IPv4 address space. Using NAT, an entirely private network can be connected to the internet via a gateway, which makes a table entry whenever a system in the private network initiates a connection. This entry helps in routing the server’s response back to the intended node. This technique of IP masquerading thus conserves global IP addresses by allowing reuse of private ones. One of the basic principles of computer science – reusability – at work.

But what should also be noted is that, in this whole process, NAT ends up breaking the end-to-end connectivity principle of the Internet. And there are a lot of other problems associated with it.

NAT requires the connection to be initiated from the internal network, so that the mapping entry in the NAT table can be made. If an external node tries to connect, the NAT gateway wouldn’t know which internal host the connection is meant for. This implies that a server cannot be hosted on the internal network, as there would be no way for an interested external host to reach it. (‘NO WAY’ in the pure form of NAT. There are workarounds, of course.)
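To make this concrete, here is a tiny (and heavily simplified) Java sketch of the kind of translation table a NAT gateway maintains. All class and method names here are illustrative, not from any real implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a NAT gateway's translation table.
public class NatTable {
    // key: external port chosen by the gateway
    // value: "internalIP:internalPort" of the host that initiated the connection
    private final Map<Integer, String> mappings = new HashMap<>();
    private int nextPort = 40000; // arbitrary start of the dynamic port range

    // Called when an internal host initiates an outbound connection:
    // the gateway picks an external port and records the mapping.
    public int outbound(String internalIp, int internalPort) {
        int externalPort = nextPort++;
        mappings.put(externalPort, internalIp + ":" + internalPort);
        return externalPort; // the source port is rewritten to this on the way out
    }

    // Called for an inbound packet: it can only be routed if an entry exists.
    public String inbound(int externalPort) {
        return mappings.get(externalPort); // null means: unsolicited, dropped
    }

    public static void main(String[] args) {
        NatTable nat = new NatTable();
        int port = nat.outbound("192.168.1.10", 51000);
        System.out.println(nat.inbound(port));   // 192.168.1.10:51000
        System.out.println(nat.inbound(12345));  // null -> no entry, packet dropped
    }
}
```

Notice that `inbound` can never succeed for a port the gateway did not hand out itself – which is exactly why an externally initiated connection (and hence a server behind pure NAT) fails.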

Not just hosting a server – NAT is also problematic for clients using certain protocols, for example FTP. When the server connects back to the client on a different port number to transfer a file, even that will cause a failure. (For those unfamiliar with the FTP mechanism: the client connects to the FTP server on port 21, and then a data channel is established by the server from port 20.)

Another problem we encounter with NAT is sending data on top of UDP. Before getting into the relationship between NAT and UDP, it is important to understand a few things about TCP and UDP. In UDP, each and every packet is complete in itself; it is not associated with the packets before or after it. This is unlike TCP, where the packets form a stream, with the data numbered serially – a flow of numbers. (For this reason, TCP is referred to as (data) stream-oriented, whereas UDP is packet-oriented.) This means that for UDP there is no need for an explicit connection as such. No connection set-up!
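The “no connection set-up” nature of UDP is easy to see in code. In this minimal Java sketch (the class name is just illustrative), a datagram is sent without any handshake – each packet simply carries its own destination address:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpDemo {
    public static void main(String[] args) throws Exception {
        try (DatagramSocket receiver = new DatagramSocket(0); // bind to any free port
             DatagramSocket sender = new DatagramSocket()) {
            byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
            // No connect(), no handshake: just address the packet and send it.
            DatagramPacket out = new DatagramPacket(
                data, data.length,
                InetAddress.getLoopbackAddress(), receiver.getLocalPort());
            sender.send(out);

            byte[] buf = new byte[1024];
            DatagramPacket in = new DatagramPacket(buf, buf.length);
            receiver.receive(in); // each datagram is complete in itself
            System.out.println(
                new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8));
        }
    }
}
```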

TCP, the more reliable transport protocol, has a three-way handshake mechanism, with the SYN flag indicating the initiation of a conversation and the FIN flag announcing its end. With the help of SYN and FIN, the NAT gateway understands the state of the conversation and knows when to create an entry and when to delete it. This is not the case with UDP. UDP has no specific connection set-up phase, hence there is no obvious criterion for the NAT gateway to recognize the start and end of a connection. It is difficult to make UDP and NAT work together. (But again, there are workarounds.)
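As a rough illustration of how a gateway copes without SYN/FIN, here is a hypothetical Java sketch that expires UDP mappings after a period of inactivity. The 30-second timeout and all names here are assumptions for illustration; real gateways use various timeout values:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: with UDP there is no FIN, so the gateway falls back to timers.
public class UdpNatTable {
    private static final long TIMEOUT_MS = 30_000; // assumed idle timeout

    private static class Entry {
        final String internalHost;
        long lastSeen;
        Entry(String internalHost, long now) { this.internalHost = internalHost; this.lastSeen = now; }
    }

    private final Map<Integer, Entry> mappings = new HashMap<>();

    // Every outbound packet creates or refreshes the mapping's timer.
    public void packetSeen(int externalPort, String internalHost, long now) {
        Entry e = mappings.get(externalPort);
        if (e == null) {
            mappings.put(externalPort, new Entry(internalHost, now));
        } else {
            e.lastSeen = now;
        }
    }

    // Inbound lookup drops entries that have been idle longer than the timeout.
    public String lookup(int externalPort, long now) {
        Entry e = mappings.get(externalPort);
        if (e == null || now - e.lastSeen > TIMEOUT_MS) {
            mappings.remove(externalPort);
            return null; // mapping expired: the inbound packet would be dropped
        }
        return e.internalHost;
    }

    public static void main(String[] args) {
        UdpNatTable nat = new UdpNatTable();
        nat.packetSeen(40000, "10.0.0.5:5000", 0L);
        System.out.println(nat.lookup(40000, 10_000L));  // still within the timeout
        System.out.println(nat.lookup(40000, 100_000L)); // idle too long -> null
    }
}
```

The guesswork in that timer is the whole problem: too short and live flows break mid-conversation, too long and stale entries pile up.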

A third point of failure is peer-to-peer protocols. When two peers are behind two different NAT gateways, it is impossible to establish a connection. Suppose the host behind NAT A, let’s call it HA, makes the first move. Its packet will pass through its own gateway, move through the external network and reach the NAT B gateway. At this point, the packet would be dropped: NAT B wouldn’t recognize it, as there is no corresponding entry for the source IP and source port. Similarly, connection set-up would fail for HB as well. Hence, it is “impossible”.

For those interested in knowing NAT’s problems in more detail, and in drawing further patterns among them, refer to RFC 3027.


Creating Zip file in Java

Recently while working on my internship project, I was required to programmatically zip a given number of files. I was surprised to find out that it is as simple as writing to a file.

Here are the steps for creating a zip file in Java.

  1. Create a ZipOutputStream object for writing to the zip file. This object is created with the help of a FileOutputStream.
  2. Create a FileInputStream for the file to be written into the zip file.
  3. Create a ZipEntry object and add it to the zip file using the function putNextEntry. Then all you have to do is read the input file using the FileInputStream and write its bytes to the ZipOutputStream.

For modularity (read simplicity) purposes, I have created a separate function ‘addToZipFile’ for steps 2 and 3 above. Also, remember to handle the FileNotFoundException and IOException.

Main module: 

import java.io.*;
import java.util.zip.*;

FileOutputStream fos = new FileOutputStream("C:/myFiles.zip"); // path of the zip file to create
ZipOutputStream zos = new ZipOutputStream(fos);
String fileName1 = "first.txt";  //Don't put whole file path
String fileName2 = "second.txt";

addToZipFile(fileName1, zos);
addToZipFile(fileName2, zos);

zos.close();


private void addToZipFile(String fileName, ZipOutputStream zos)
        throws FileNotFoundException, IOException {
    System.out.println("Writing '" + fileName + "' to zip file");

    File file = new File(fileName);
    FileInputStream fis = new FileInputStream(file);
    ZipEntry zipEntry = new ZipEntry(fileName);
    zos.putNextEntry(zipEntry);  // start a new entry in the zip file

    byte[] bytes = new byte[1024];
    int length;
    while ((length = >= 0) {
        zos.write(bytes, 0, length);
    }

    zos.closeEntry();
    fis.close();
}

That’s it. It is so easy.



Deep Linking

Hello readers!

This post is an introduction to an interesting and emerging technology – Deep linking.

We have all noticed that we can directly link to a section of a Wikipedia page with #. For example, if we wanted a friend to read the Attractions section of the Harry Potter page on Wikipedia, we could give the link to that section directly instead of the link to the whole page. This technology, with regard to the World Wide Web as in this example, is called Web Deep Linking.

Web deep linking links directly to specific, indexed content on a web site. For HTTP, deep links and the link to the primary resource (the home web page) are functionally equal. Hence, we can say that deep linking is built in.

In the above example, the “Attractions” part of the URI is called the fragment identifier, which is used for recognizing the exact required portion of the web page. The fragment identifier in our example is introduced by the hash mark “#”. There are also other ways of pointing at a specific part of a resource, such as the query part introduced by ‘?’. Also, it should be understood that the processing of the fragment identifier is done entirely by the web browser and not by the web server.
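This split between the server-side path and the browser-side fragment is easy to see with Java’s built-in java.net.URI class (the Wikipedia-style URL below is just an illustrative example):

```java
import java.net.URI;

public class FragmentDemo {
    public static void main(String[] args) {
        URI uri = URI.create("https://en.wikipedia.org/wiki/Harry_Potter#Attractions");
        // The path is what gets sent to the web server...
        System.out.println(uri.getPath());     // /wiki/Harry_Potter
        // ...while the fragment is resolved locally by the browser.
        System.out.println(uri.getFragment()); // Attractions
    }
}
```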

Web deep linking has become a de facto standard, as it allows linking directly to a specific web page or to specific content on a web page. If it hadn’t been for deep linking, we wouldn’t have been able to utilize the browser history and refresh-button capabilities to their fullest. If you post a link on your blog and, when a reader clicks it, he/she is directed to the homepage instead of the “deep” content of the site that you wanted the reader to see, then you should know the website lacks deep linking capability.

Similar to Web deep linking, we have another concept of Mobile deep linking.


Img 1

Image 1 shows mobile deep linking (on the left) against web deep linking (on the right) to highlight the difference.

Mobile deep linking allows the user to open a particular location within a mobile app already installed on the device, simply by following a URI, for example from a Google search. However, the URI used for opening the desired content differs from OS to OS, and the developer has to take care to properly configure the app to handle a given URI.



Img 2

Image 2 very nicely brings out the improvement in user experience that results from the use of deep linking.

There is a more advanced level of mobile deep linking which allows the user to open specific content in an app even when the app is not installed on the device. It works by first installing the application on the device, passing the intent (Android developers would know this terminology; for others, it is a kind of message which communicates the operation to be performed at the destination) through the app store to the newly installed app, and then finally opening the required content in the app. This advanced form of mobile deep linking is known as Deferred deep linking, and as can be seen, it provides a much enhanced user experience.


Img 3

The difference in process between mobile deep linking and deferred deep linking is highlighted by image 3.


Keywords: Web deep linking, Mobile Deep linking, Fragment Identifier, Deferred deep linking


Image Sources:

Img 1.)

Img 2.)

Img 3.)


If you enjoyed, feel free to comment/post on social networks. If you found anything wrong, kindly let me know. If you want to share some useful insight on the topic, please do that. I would love to know more!


Understanding Basics of Lambda Architecture


I came across the term Lambda architecture for the first time while reading a HighScalability post on how it was used to build an internationalized app for David Guetta’s social online campaign “This One’s For You”. Lambda architecture was mentioned as a ‘quick and simple’ way of achieving scalability.

Lambda architecture was introduced by Nathan Marz, a renowned personality in the big data community for his work on the Storm project. The book “Big Data – Principles and Best Practices of Scalable Realtime Data Systems”, written by Nathan Marz and James Warren, presents a much deeper understanding of the architecture. Lambda architecture is a data processing architecture, more specifically one associated with big data.

Data systems are an integral part of software design. They need to be able to handle really HUGE amounts of data (well, most of the time; at least the web software solutions do) – handle in terms of storing it and quickly answering queries. And sometimes these data systems are required to last longer than the actual application itself! Considering this, data systems have to be designed very meticulously, so that they are not only reliable but also scalable.

The picture below shows what a Lambda architecture looks like.

1. Lambda Architecture

Lambda architecture seeks to get the best out of both batch processing and stream (or near real-time) processing. Batch processing is simple, more accurate, and not much affected by issues of consistency and locking. However, it can be annoyingly slow. And that is where real-time computations, which are much faster (often by working on approximations), come into the picture.

Comprising a system of three layers – Batch layer, Serving layer and Speed layer – Lambda architecture works by copying and processing data on two layers: the batch and speed layers. It is necessary that the data be immutable, as will be seen later. The time-stamped data received is simply appended, rather than overwriting any previous record.

Also, the architecture requires queries to be pre-computed and stored as views. This helps in achieving speed. These views are created both by real-time processing and by batch processing. Results from the two types of computations are merged such that real-time views are overridden by the batch views, because the latter are more accurate. Any query can then be answered by merging the two types of views.
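A hypothetical Java sketch of this query-time merge (all names here are illustrative, not from any real Lambda implementation; the views are simplified to key/count maps):

```java
import java.util.HashMap;
import java.util.Map;

public class ViewMerger {
    // Batch views are more accurate, so where both layers have a value for
    // the same key, the batch value overwrites the real-time one.
    public static Map<String, Long> merge(Map<String, Long> batchView,
                                          Map<String, Long> realtimeView) {
        Map<String, Long> result = new HashMap<>(realtimeView);
        result.putAll(batchView); // batch wins on overlapping keys
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> batch = Map.of("pageA", 100L, "pageB", 250L);
        Map<String, Long> realtime = Map.of("pageB", 240L, "pageC", 7L);
        // pageA comes from batch, pageB takes the batch value (250 overrides
        // the approximate 240), pageC is only known to the speed layer so far.
        System.out.println(merge(batch, realtime));
    }
}
```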

To have a detailed understanding of the architecture, let’s look at each of the three layers individually.

  1. Batch Layer

This layer has basically two functions. The first is to store raw data as it comes, thereby continuously growing the master data set, which is stored on HDFS (Hadoop Distributed File System). Note: stored by appending!

The second function of this layer is to compute views using MapReduce. Iterations of MapReduce recompute the views again and again using the complete data set. Since it uses the complete data set, it can fix errors and give highly accurate results.

  2. Serving Layer

This layer stores the computed views from the batch layer, indexes them, and makes them available for queries. More appropriately, the serving layer is like a QFD (Question Focused Dataset), as James Kinley calls them.

Cloudera Impala, along with the Hive Metastore, could be utilized for querying this layer.

  3. Speed Layer

MapReduce by design has high latency and is thus not suited for real-time computations. This layer creates real-time views using Storm and also exposes these views for queries. These real-time views are discarded as soon as the more accurate and precise views from the batch layer are generated.

However you look at it, Lambda architecture is the best of both worlds. With low latency, high throughput, high accuracy and fault tolerance, it meets the requirements of today’s web – reliable and scalable solutions.


For further reading on the topic, you can go through the following resources:

  1. Semantikoz
  2. Wikipedia
  3. Infoq
  4. The Highscalability post mentioned in the beginning
  5. James Kinley’s blog
  6. Dr. Dobb’s

#Off-topic now

Also, for anyone interested in reading Nathan Marz’s blog, here you go. His two posts that really inspired me (as you can see) are: 1. You should blog even if you have no readers, and 2. Break into Silicon Valley with a Blog.


I hope I have been able to develop some curiosity about and understanding of Lambda architecture. If you enjoyed, feel free to comment/post on social networks. If you found anything wrong, kindly let me know. If you want to share some useful insight on the topic, please do that. I would love to know more!

Ok, bye!