Key Terms and Concepts in Scalability

Performance tuning.- This step would consist of refactoring a web application’s source code, analyzing a web application’s configuration settings, attempting to further parallelize a web application’s logic, implementing caching strategies, detecting hot spots and another series of often invasive — code wise that is — procedures throughout a web application’s tiers. These topics will be detailed as the book progresses.

Vertical Scaling: In scaling terminology, this implies that a box or node on which a web application is running can be upwardly equipped with more CPU, Memory, Bandwidth or I/O capacity. Thus if a web application encounters a greater demand for any of these resources, and you are able to move a web application or one of its tiers to a box or node with greater capacity, you will have vertically scaled an application.

Horizontal Scaling: Horizontal scaling refers to assigning resources on a lateral basis. In scaling terminology, this implies that the node on which a web application is running cannot be equipped with more CPU, Memory, Bandwidth, or I/O capacity and thus a web application is split to run on multiple boxes or nodes.
The process of horizontally scaling a web application is more elaborate than that of vertically scaling. The reason is that by relying on horizontal scaling, you need to devise a way to decouple a web application. The simplest way to decouple a web application is by means of its tiers. Recapping from the previous paragraphs, these tiers would be:

Static content tier.- Consists of static images or other resources that make up an application’s interface and don’t need processing. In most web applications this would be JPEG/GIF images, Javascript libraries, Cascading style sheets or pre-built HTML pages.
Business logic tier.- Consists of a web framework for processing data provided by users or stored in the permanent storage tier. Depending on your preferences, this could be Ruby on Rails, PHP Cake, Django, Grails(Java) or MonoRail(.NET) among many others.
Permanent storage tier.- Consists of a permanent storage location for data. In most web applications this is a relational database system(RDBMS).

What are B-Trees

a B-tree is a self-balancing tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children (Comer 1979, p. 123). Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. B-trees are a good example of a data structure for external memory. It is commonly used in databases and filesystems.

All terminal nodes are the same distance from the base

Features of Object Oriented Programming

Encapsulation is a mechanism by which you restrict the access to some of the object’s components, as well as binding the data and methods operating on the data. Encapsulation is a way to obtain ‘information hiding’.

Abstraction is the process of moving from a specific idea to a more general one. For example take your mouse. “Silver Logitech MX518” is a concrete, specific item, and “mouse” is an abstraction of that.


Displaying multiple behavior. Achieved via virtual functions, operator overloading, function overloading, etc

Virtual Function: Without “virtual” you get “early binding”. Which implementation of the method is used gets decided at compile time based on the type of the pointer that you call through. With “virtual” you get “late binding”.

Virtual Constructor:

Stack vs Heap

Stack is used for static memory allocation and Heap for dynamic memory allocation, both stored in the computer’s RAM .

Variables allocated on the stack are stored directly to the memory and access to this memory is very fast, and it’s allocation is dealt with when the program is compiled. The stack is always reserved in a LIFO(Last in first out) order, the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack, freeing a block from the stack is nothing more than adjusting one pointer. You can use the stack if you know exactly how much data you need to allocate before compile time and it is not too big. The stack is attached to a thread, so when the thread exits the stack is reclaimed.

When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called.

int x=1;
int x=y;

Variables allocated on the heap have their memory allocated at run time and accessing this memory is a bit slower, but the heap size is only limited by the size of virtual memory . Element of the heap have no dependencies with each other and can always be accessed randomly at any time. You can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time.The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits. Heap can be Responsible for memory leaks.

obj1 = new Student();
obj1.prop1 ;

In a multi-threaded situation each thread will have its own completely independent stack but they will share the heap.

Functional vs Procedural vs Object Oriented Programming

In a purely procedural style, data tends to be highly decoupled from the functions that operate on it. Procedural is a type of imperative programming style. An imperative language uses a sequence of statements to determine how to reach a certain goal. These statements are said to change the state of the program as each one is executed in turn.

In an object oriented style, data tends to carry with it a collection of functions.

Functional programming (often abbreviated FP) is the process of building software by composing pure functions, avoiding shared state, mutable data, and side-effects. A function is called ‘pure’ if all its inputs are declared as inputs – none of them are hidden – and likewise all its outputs are declared as outputs. Functional programming is about writing pure functions, about removing hidden inputs and outputs as far as we can, so that as much of our code as possible just describes a relationship between inputs and outputs. Let’s not hide what a piece of code needs, nor what results it will yield. If a piece of code needs something to run correctly, let it say so. If it does something useful, let it declare it as an output. When we do this, our code will be clearer. Complexity will come to the surface, where we can break it down and deal with it.

Explain FLUX

Flux is an architectural pattern that enforces unidirectional data flow.

MVC vs Flux:
MVC did not scale well for Facebook’s huge codebase. The main problem for them was the bidirectional communication, where one change can loop back and have cascading effects across the codebase (making things very complicated to debug and understand).

How does Flux solve this? By forcing an unidirectional flow of data between a system’s components.
In general the flow inside the MVC pattern is not well defined. A lot of the bigger implementations do it very differently (e.g. Cocoa MVC vs. Ruby on Rails MVC).
Flux on the other hand is all about controlling the flow inside the app?—?and making it as simple to understand as possible.

Action –> Dispatcher –> Store –> View

Facebook found that two-way data bindings led to cascading updates, where changing one object led to another object changing, which could also trigger more updates. As applications grew, these cascading updates made it very difficult to predict what would change as the result of one user interaction. When updates can only change data within a single round, the system as a whole becomes more predictable.