The Learnings of a QA engineer


Swap Space in Linux

What is swap?

Swap is hard drive space reserved to act as extra RAM when your computer needs more memory than is physically available. Note that when the system starts using swap, it may slow down noticeably, because disk access is much slower than RAM.

How to create it?

Most likely, you created a swap partition during the initial Red Hat setup. You can verify the amount of swap space available on your system using:

cat /proc/meminfo

The general recommendation is to have: at least 4 MB of swap space; at least 32 MB of total (physical + swap) memory for a command-line-only system; at least 64 MB of total (physical + swap) memory for a system running the X Window System; and swap space of at least 1.5 times the physical memory on the system.

If this is too complicated, you might want to have a swap twice as large as your physical (silicon) memory, but not less than 64 MB.

If you ever need to change your swap, here are some basics.

How many swap partitions can I have?

You can have several swap partitions. Older Linux kernels limited the size of each swap partition to approximately 124 MB, but kernels from 2.2.x onward do not have this restriction.

Steps to create and enable a swap partition:

  • Create the partition of the proper size using fdisk
  • Format the partition checking for bad blocks, for example:

mkswap -c /dev/hda4

You have to substitute /dev/hda4 with your partition name. Since the partition size is not specified, it will be detected automatically.

  • Enable the swap, for example:

swapon /dev/hda4

  • To have the swap enabled automatically at boot up, you have to include the appropriate entry into the file /etc/fstab, for example:

/dev/hda4 swap swap defaults 0 0

  • If you ever need to disable the swap, you can do it with (as root):

swapoff /dev/hda4

Steps to create and enable a swap file:

Swapping to files is usually slower than swapping to a raw partition, so this is not the recommended permanent swapping technique.

Creating a swap file, however, can be a quick fix if you temporarily need more swap space.

On older kernels you can have up to 8 swap files, each up to 16 MB in size; later kernels lift these limits.

Here are the steps for making a swap file:

  • Create a file with the size of your swap file:

dd if=/dev/zero of=/swapfile bs=1024 count=8192

This physically creates the swap file /swapfile: the block size is 1024 bytes and the file contains 8192 blocks, so the total size is 8 MB (1024 × 8192 bytes). [The dd command copies data. In the example above, the input file (if) is /dev/zero and the output file (of) is /swapfile.]

  • Set up the file with the command:

mkswap /swapfile

  • Force writing the buffer cache to disk by issuing the command:

sync

  • Enable the swap with the command:

swapon /swapfile

  • When you are done using the swap file, you can turn it off and remove it:

swapoff /swapfile

rm /swapfile

Identify Establishment Code Of A PF Account

To help people who cannot identify the establishment code of their PF account, here is the information. The establishment code is a unique number assigned to your organization by the Government of India, and the extension number identifies the branch of your organisation, if any. The establishment code is part of your PF account number and is a 5-digit code. Here is a simple illustration that highlights (underlined in red) the establishment codes of two PF accounts.

[Image: two PF account numbers with the establishment codes underlined in red]

For example, if your Bangalore PF account number is KN/46753/987, then the establishment code of your PF account is 46753.

Hope this article helps you to easily identify the establishment code of your PF account.

Compendium of DBMS

Data

Every organization has information needs. A library maintains a list of members, books, due dates, and fines. A company maintains information about employees, departments, and salaries. These pieces of information are called data.

Data storage on different media

Organizations can store data on various media and in different formats. For example, a hardcopy document can be stored in a filing cabinet. In addition, you can store data in an electronic spreadsheet or in a database.

Database

A coherent collection of data records is called a database. In other words, a database is an organized collection of information. The benefit of storing information in a database is that the data becomes easy to access and manage.

In layman’s terms, a database is like an almirah (cupboard) with different compartments, each with a specific label. Whenever you want to take something out or put something in, you first find the right compartment and then take out or put in the required things.

Advantages of using database

  • Reduction in data redundancy
  • Increased consistency
  • More data integrity
  • Independence from application programs, i.e., data independence
  • Data security and crash recovery
  • Improvement in data access by using host and query languages
  • Concurrent data access

Disadvantages of using database

  • They are complex and difficult to use
  • They are time-consuming to design
  • Start-up costs for hardware and software
  • They require deep technical know-how, and initial training is required for all programmers and users
  • Extensive conversion costs in moving from a file-based system to a database system
  • Damage to the database affects all application programs, so backups are needed

Database Management

Performing operations such as insertion, deletion and updating is called managing the database.

Database Management System (DBMS)

To manage data efficiently we need a database management system. It is a system that stores, modifies and retrieves data in the database on request. It is a collection of programs that allows users to work with large amounts of data reliably.

Database Types

Based on the way in which data is stored, databases are divided into different categories: hierarchical, network, relational and, more recently, object relational.

Databases that store data in a tree structure in which each record has exactly one parent are called hierarchical database management systems. Databases that store data in a similar linked structure in which a record can have more than one parent are called network database management systems. Databases that store data in tables are called relational database management systems (RDBMS). Databases that store data as objects within tables are called object relational database management systems (ORDBMS).

Before 1970, hierarchical and network data structures were more popular models for information storage. These models depend on the structure of the data and the way information is linked in the hierarchy or network.

Based on how the data is distributed, databases are also divided into three types: centralized, distributed and homogeneous databases.

If the data is stored at a single computer site, it is called a centralized database.

If the database and DBMS software are distributed over many sites, the system is called a distributed database system.

If the database system uses the same DBMS software at multiple sites, it is called a homogeneous database system.

Definition of relational database

A relational database is a collection of relations, or two-dimensional tables. A relational database uses these two-dimensional tables to store the data.

Components of the relational database model

The relational database model mainly consists of three components. They are a collection of database objects or relations, set of operations, and data integrity rules. The tables are the objects in the relational database model.

DBMS Systems in use

Many DBMS products are available in the market. Some of the most popular include Oracle, IBM’s DB2, Microsoft Access, Microsoft SQL Server, MySQL, Sybase, PostgreSQL, Predator and FileMaker. MySQL, one of the most popular open source database management systems used by online entrepreneurs, is an example of a relational DBMS. Microsoft Access (another popular DBMS) is also relational rather than fully object-oriented, even though it does offer certain object-oriented features.

Chronicle of Unix

Situation of Computer science and OS before Unix

Computer systems didn’t talk to each other in the early days of computing. Even the various computer lines made by the same company often needed interpreters. And forget any interoperability of systems by different vendors!

In addition, operating systems very often performed only limited tasks, and only on the machines for which they were written. If a business upgraded to a bigger, more powerful computer, the old operating system probably wouldn’t work on the new computer, and often the company’s data had to be entered — again — into the new machine.

The first step towards UNIX–Multics


To try to develop a convenient, interactive, useable computer system that could support many users, a group of computer scientists from Bell Labs and GE in 1965 joined an effort underway at MIT on what was called the Multics (Multiplexed Information and Computing Service) mainframe timesharing system.


Over time, hope was replaced by frustration as the group effort initially failed to produce an economically useful system. Bell Labs withdrew from the effort in 1969 but a small band of users at Bell Labs Computing Science Research Center continued to seek the Holy Grail.

How Multics life ended

Actually, Multics worked, and eventually became a product, but not initially on the scale its developers wanted. Multics could not then support many users, and in 1969 Ken Thompson began trying to find an alternative to it. Thompson decided to satisfy two urges: to write an operating system of his own, and to create an environment in which to do future work.

The First Machine


It did not take long, therefore, for Thompson to find a little-used PDP-7 computer with an excellent display processor. It was borrowed for writing the operating system. By today’s standards the PDP-7 was tiny: its memory amounted to only a few thousand 18-bit words, and instead of a keyboard it had a teletypewriter. The machine was not cheap either: it cost around 70,000 dollars at that time, while offering only a fraction of the capability of an ordinary present-day cell phone.

The Development of Unix

Dennis Ritchie urged Thompson to use FORTRAN to write the operating system, but within 4 hours of starting they gave up, mainly because FORTRAN lacked the data structures they needed. Thompson then wrote a very simple language called B, which he got going on the PDP-7.


It worked, but there were problems. Above all, because the implementation was interpreted, it was always going to be slow. Finally, they decided to write the operating system in C, the language Ritchie developed as a successor to B. By 1973, the operating system rewritten in C had become a reality. Novices to the system started experimenting by linking different commands together to get the output they wanted. Joe Condon, the owner of the PDP-7 that Thompson first used for UNIX, started using UNIX himself.

Manual of Unix

Manuals are often an afterthought, something cobbled together after the product is made. The UNIX manual, in contrast, reflected the philosophy of UNIX in design and content. It even told you where the bugs were. The manual style initially was set by Ritchie, but McIlroy soon took over its compilation. From then on they started distributing the operating system to those who needed it.

Introduction of New Unix Flavors

In 1971, the government introduced a regulation saying that companies could not distribute the software they developed, and demanded that it be used for official purposes only.

Thompson was very upset with the government regulation. He took a 6-month sabbatical from Bell Labs and taught the UNIX system as a visiting professor at the Computer Science Department of the University of California, Berkeley (UCB). His time there helped give rise to a new flavor of UNIX called BSD (Berkeley Software Distribution), which is known for its ability to handle heavy loads. Two other variants, OpenBSD and NetBSD, are known for their security and portability respectively.


Symbolics and its End


In the 1980s many of the best coders worked in the AI lab of the Massachusetts Institute of Technology (MIT). To this day they are considered some of the most capable programmers. They wrote a great deal of software and shared it with many people. Then a company called Symbolics came on the scene and hired them with large salaries and perks. Symbolics was the first company to register a .com domain. All the good programmers started working for Symbolics, and within 6 months they stopped sharing code because of company policy.
Symbolics was able to hire every programmer from the lab except one: Richard Matthew Stallman. He was upset with Symbolics’ strategy and started reimplementing the software that Symbolics produced and sold at a high price, distributing his versions for free. This undercut Symbolics’ business, and the company eventually went bankrupt.

GNU and GPL

He then stopped troubling Symbolics and started a project called GNU (pronounced “g’noo”). The name stands for GNU’s Not Unix. Through this project he produced software and gave it away for free. He also created a license called the GPL, i.e., the General Public License. The idea behind it is called copyleft. According to the GPL, once you receive a piece of software you are free to use, study and modify it. The only condition is that if you want to pass it on to anyone, you must make the source code available and pass it on under the same license terms.


Kernel

Richard Stallman had everything from the GCC compiler to the Emacs editor, but he did not have a kernel. He started work on a kernel named Hurd. Around the same time, a 21-year-old named Linus Torvalds wrote a kernel of his own. People suggested using that kernel instead of writing a new one, and the combination became GNU/Linux.

The kernel is the central component of most computer operating systems; it is a bridge between applications and the actual data processing done at the hardware level. The kernel’s responsibilities include managing the system’s resources.


The main parts of any OS are user space and kernel space. In GNU/Linux, GNU is the user space and Linux is the kernel space. Of the whole, about 20 to 30 MB is the kernel and the rest is GNU. In this combination, either GNU or Linux can be replaced with a suitable alternative: some companies use GNU without Linux, and some people use Linux without GNU.

The kernel is hidden from the user; the shell wraps around it and mediates access to it.

Shell


A shell in Unix is a command-line interpreter and script host that provides a traditional user interface for the Unix operating system and for Unix-like systems. Users direct the operation of the computer by entering commands as text for the command-line interpreter to execute, or by creating text scripts of one or more such commands. The shell passes each request to the kernel, which interacts with the hardware and software to carry out the command.

Is it necessary?

An OS that has no user interface does not need a shell. The shell used by MS-DOS is command.com, and the shell of Windows is explorer.exe.

Development of shells

The first shell developed was the Thompson shell. Thompson wrote it and was practically the only person who used it. People called it an ugly shell.

Then Steve Bourne developed the Bourne shell, commonly called sh. It was good for programming but not for simple interactive use.

In the meantime Bill Joy developed the C shell, commonly called csh. He is also the author of the vi editor and a co-founder of Sun Microsystems. csh was not programmer friendly.

The K shell was developed by David Korn at AT&T Bell Laboratories. People think that Korn shell and kshell are the same, but there is a difference: kshell is the tool and Korn shell is the language used in kshell. It is similar to C and Turbo C, where Turbo C is the tool and C is the language. The disadvantage of kshell was its cost: it was expensive.

Bash is the shell developed by Brian Fox. Its name stands for Bourne-Again SHell. It is free.

Shell scripting

A shell script is a series of commands written in a plain text file. A shell script is like a batch file in MS-DOS, but has more power than an MS-DOS batch file.

Why shell scripts

  • A shell script can take input from the user or a file and output it on the screen.
  • Useful for creating our own commands.
  • Saves lots of time.
  • Automates day-to-day tasks.
  • System administration tasks can also be automated.

Advantages of shell scripting

  • Commands can be used directly
  • A typical shell script is about one tenth the length of a C program written for the same purpose
  • It is easy to learn
  • The learning curve is gentle

Disadvantages of shell scripting

  • Shell scripts are built from commands, and each use of an external command starts a separate program, so execution is slow
  • Shell scripts cannot connect to a database directly. They need third-party tools or libraries.

Shell that can be used for scripting

Tcsh is the advanced version of csh. Bash, sh and ksh are very handy for writing shell scripts, whereas csh and tcsh are much less useful because of their weak loops and aliases. They are not really scripting shells.

Various flavors of UNIX that are in use

  • AIX from IBM
  • Solaris from Sun
  • HP-UX from HP
  • SCO from SCO Inc.
  • Digital Unix from Digital Equipment Corporation
  • Tru64 from Compaq
  • IRIX from Silicon Graphics
  • Mac OS from Apple

C-concepts: Part1

Declaration and definition of a variable/function

Declaring a variable or function simply states that it exists somewhere in the program; no memory is allocated for it. The declaration still plays an important role: it tells the compiler the type. When a variable is declared, the compiler knows its data type. When a function is declared, the compiler knows its arguments, their data types, their order, and the return type of the function. A definition does everything a declaration does and also allocates memory for the variable or function. Therefore, we can think of definition as a superset of declaration (or declaration as a subset of definition). It follows that a variable or function can be declared any number of times but defined only once (you can’t have two locations for the same variable or function).

Understanding “extern” keyword in C

You need to understand declaration and definition before you can understand the “extern” keyword. Let us first take the easy case: the use of extern with C functions. By default, the declaration and definition of a C function have “extern” prepended to them. That is, even when we don’t write extern on the declaration or definition of a C function, it is implicitly there. For example, when we write:

int foo(int arg1, char arg2);

There’s an extern present in the beginning which is hidden and the compiler treats it as below.

extern int foo(int arg1, char arg2);

The same is true of the definition of a C function (defining a C function means writing its body): an implicit extern is present at the beginning of the definition. Since a declaration can appear any number of times and a definition only once, the declaration of a function can be added to several C/H files, or several times to a single C/H file, but the actual definition appears only once (i.e. in one file only). And because extern extends the visibility to the whole program, the function can be called from any file of the program, provided its declaration is known. (Knowing the declaration, the C compiler assumes that the definition exists somewhere and goes ahead and compiles the code.) So that’s all about extern with C functions.

Now let us take the second and final case, i.e. the use of extern with C variables. I find it more interesting and informative than the previous case, where extern is present by default. So let me ask the question: how would you declare a C variable without defining it? Many of you will find it trivial, but it is an important question for understanding extern with C variables. The answer is as follows.

extern int var;

Here, an integer variable called var has been declared (remember: no definition, i.e. no memory allocated for var so far). We can repeat this declaration as many times as needed. So far so good. :)

Now, how would you define a variable? I agree that it is the most trivial question in programming, and the answer is as follows.

int var;

Here, an integer variable called var has been declared as well as defined (remember that definition is a superset of declaration), so memory for var is also allocated. Now here comes the surprise. When we declared or defined a C function, an extern was present by default, and we can also write extern explicitly on a function definition without any issues. This is not the case with C variables. If extern were implied by default on variables, memory would never be allocated for them; they would only ever be declared. Therefore, we write extern explicitly on C variables when we want to declare them without defining them. Also, since extern extends the visibility to the whole program, an extern declaration lets us use the variable anywhere in the program, provided the declaration is known and the variable is defined somewhere.

Now let us try to understand extern with examples.

Example 1:
int var;
int main(void)
{
    var = 10;
    return 0;
}

Analysis: This program compiles successfully. Here var is defined (and declared implicitly) globally.

Example 2:
extern int var;
int main(void)
{
    return 0;
}

Analysis: This program compiles successfully. Here var is only declared. Notice that var is never used, so there are no problems.

Example 3:
extern int var;
int main(void)
{
    var = 10;
    return 0;
}

Analysis: This program fails to build. The compiler accepts it, but the linker reports an undefined reference, because var is declared but not defined anywhere. Essentially, var has not been allocated any memory, and the program is trying to assign 10 to a variable that doesn’t exist at all.

Example 4:
#include "somefile.h"
extern int var;
int main(void)
{
    var = 10;
    return 0;
}

Analysis: Supposing that somefile.h contains the definition of var, this program will compile successfully.

Example 5:
extern int var = 0;
int main(void)
{
    var = 10;
    return 0;
}

Analysis: Do you think this program will work? Well, here comes another surprise from the C standard. It says that if a declaration of a variable also provides an initializer, then memory for that variable is allocated, i.e. the declaration is treated as a definition. Therefore, as per the C standard, this program compiles successfully and works (although some compilers emit a warning about an initialized extern variable).

So that was a preliminary look at “extern” keyword in C.

I’m sure that you want to have some take away from the reading of this post. And I would not disappoint you. :)
In short, we can say

1. Declaration can be done any number of times but definition only once.
2. The “extern” keyword is used to extend the visibility of variables and functions.
3. Since functions are visible throughout the program by default, extern is not needed in function declarations or definitions. Its use is redundant.
4. When extern is used with a variable, it’s only declared not defined.
5. As an exception, when an extern variable is declared with initialization, it is taken as definition of the variable as well.

Mystery of Big Endian and Little Endian

What are these?

Little and big endian are two ways of storing multibyte data types (int, float, etc.) in memory. In little endian machines, the last (least significant) byte of the binary representation is stored first, i.e. at the lowest address. In big endian machines, the first (most significant) byte of the binary representation is stored first.

Suppose an integer is stored as 4 bytes (for those who are using DOS-based compilers such as Turbo C++ 3.0, an integer is 2 bytes). Then a variable x with value 0x01234567 will be stored as follows, from the lowest address to the highest:

Big endian: 01 23 45 67
Little endian: 67 45 23 01

How to see memory representation of multibyte data types on your machine?

Here is a sample C program that shows the byte representation of an int; the same function can be used for float and pointer values.
#include <stdio.h>

/* function to show bytes in memory, from location start to start+n */
void show_mem_rep(const unsigned char *start, int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf(" %.2x", start[i]); /* unsigned char avoids sign extension of bytes >= 0x80 */
    printf("\n");
}

/* Main function to call above function for 0x01234567 */
int main(void)
{
    int i = 0x01234567;
    show_mem_rep((const unsigned char *)&i, sizeof(i));
    getchar();
    return 0;
}

When the above program is run on a little endian machine, it gives “67 45 23 01” as output, while on a big endian machine it gives “01 23 45 67”.

Is there a quick way to determine endianness of your machine?

There are many ways of determining the endianness of your machine. Here is one quick way of doing it.
#include <stdio.h>
int main(void)
{
    unsigned int i = 1;
    char *c = (char *)&i;   /* points at the first (lowest-address) byte of i */
    if (*c)
        printf("Little endian");
    else
        printf("Big endian");
    getchar();
    return 0;
}

In the above program, a character pointer c is pointing to an integer i. Since size of character is 1 byte when the character pointer is de-referenced it will contain only first byte of integer. If machine is little endian then *c will be 1 (because last byte is stored first) and if machine is big endian then *c will be 0.

Does endianness matter for programmers?

Most of the time the compiler takes care of endianness; however, endianness becomes an issue in the following cases.

It matters in network programming and data exchange: Suppose you write integers to a file on a little endian machine and transfer this file to a big endian machine. Unless there is a little endian to big endian transformation, the big endian machine will read the bytes of each integer in reverse order.

Standard byte order for networks is big endian, also known as network byte order. Before transferring data on network, data is first converted to network byte order (big endian).

Sometimes it matters when you are using type casting. The program below is an example.
#include <stdio.h>
int main(void)
{
    unsigned char arr[2] = {0x01, 0x00};
    /* reinterprets the two bytes as one short: the result depends on endianness */
    unsigned short int x = *(unsigned short int *) arr;
    printf("%d", x);
    getchar();
    return 0;
}

In the above program, a char array is cast to an unsigned short integer type. When I run the program on a little endian machine, I get 1 as output (0x0001), while on a big endian machine I get 256 (0x0100). To make programs independent of endianness, this programming style should be avoided.

What are bi-endians?

Bi-endian processors can run in both little and big endian modes.

What are the examples of little, big endian and bi-endian machines?
Intel-based processors are little endian. ARM processors were originally little endian; current-generation ARM processors are bi-endian.

Motorola 68K processors are big endian. PowerPC (by Apple, IBM and Motorola) and SPARC (by Sun) processors were big endian; current versions of these processors are bi-endian.

Does endianness affect file formats?

File formats which have 1 byte as a basic unit are independent of endianness, e.g., ASCII files. Other file formats use some fixed endianness, e.g., JPEG files are stored in big endian format.

Which one is better: little endian or big endian?

The terms little and big endian came from Gulliver’s Travels by Jonathan Swift. Two groups could not agree on which end an egg should be opened at: the little or the big. Just like the egg issue, there is no technological reason to choose one byte ordering convention over the other, hence the arguments degenerate into bickering about sociopolitical issues. As long as one of the conventions is selected and adhered to consistently, the choice is arbitrary.

Memory Areas in C Language

When we come across memory segments in a C program, these are the questions that come to mind.

  • What happens when a C program is loaded into memory?
  • Where are the different types of variables allocated?
  • Why do we need two data sections, initialized and un-initialized?
  • If we initialize a static or global variable with 0, where will it be stored?
  • Even though the scopes of global and static variables are different, why are they stored in the same section, i.e., the data segment?

Let’s look at some of these interesting under-the-hood details here. We know that a C program which is compiled to an executable and loaded into memory for execution has 4 main segments in memory. They are the data, code, stack, and heap segments.

[Figure: Memory layout]

Global and function static variables are allocated in the data segment. The compiler converts the executable statements in a C program, such as printf("hello world");, into machine code, which is loaded in the code segment. When the program executes, function calls are made. Executing each function requires allocating memory, as if in a frame, to store information such as the return address and local variables. Since this allocation is done on the stack, these are known as stack frames. When we do dynamic memory allocation, such as with the malloc function, memory is allocated in the heap area.

Static and Dynamic Segments

The data and code segments are of fixed size. Their sizes are fixed and known at the point the program is compiled, hence they are known as static segments. The sizes of the stack and heap areas are not known when the program is compiled. It is also possible to change or configure the sizes of these areas (i.e., increase or decrease them). So, these are called dynamic segments.

Let’s look at each of these segments in detail.

Data segment:- The data segment holds the values of variables that need to be available throughout the lifetime of the program. So it is obvious that global variables should be allocated in the data segment. How about local variables declared as static? Yes, they are also allocated in the data area, because their values must be available across function calls. If they were allocated in the stack frame itself, they would be destroyed once the function returned. The only option is to allocate them in a global area, so they are allocated in this segment. The lifetime of a local static variable is therefore the lifetime of the program.

There are two parts in this segment: the initialized data segment and the un-initialized data segment.
When variables are initialized to some value other than 0, they are allocated in the initialized segment. When variables are un-initialized (or, typically, initialized to 0), they are allocated in the un-initialized data segment. This segment is usually referred to by the cryptic acronym BSS. It stands for Block Started by Symbol and gets its name from an assembler for old IBM systems.
The data area is split in two based on explicit initialization because variables that are explicitly initialized must have their values stored in the executable, whereas variables that are not initialized need not be explicitly set to zero one by one. Instead, the job of initializing them to zero is left to the operating system. This bulk initialization can greatly reduce the time required to load the program.

When we want to run an executable program, the OS starts a program known as loader. When this loads the file into memory, it takes the BSS segment and initializes the whole thing to zeros. That is why the un-initialized global data and static data always get the default value of zero.

The layout of data segment is in the control of the underlying OS. However some loaders give partial control to the users. This information may be useful in applications such as embedded systems.
The data area can be addressed and accessed using pointers from the code. Automatic variables have an overhead in initializing the variables each time they are required, and code is required to do that initialization. However, variables in the data area do not have such runtime overhead, because the initialization is done only once and that too at loading time.

Code segment:- The code segment is where the executable code of the program resides. This area is also known as the text segment and is of fixed size. From a C program it is reached through function pointers and not through ordinary data pointers. Another important point to note is that the system may treat this area as read-only memory, so any attempt to write to it can lead to undefined behavior.

Stack and heap segments:- To execute the program, the two major parts of memory used are the stack and the heap. Stack frames are created on the stack for function calls, and the heap is used for dynamic memory allocation. The stack and heap are un-initialized areas. Therefore, whatever happens to be in the memory becomes the initial (garbage) value for the objects created in that space.

Local variables and function arguments are allocated on the stack. For local variables that have an initialization value, the compiler generates code to initialize them explicitly to those values when the stack frame is created. For function parameters, the compiler generates code to copy the actual arguments to the space allocated for the parameters in the stack frame.
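The copying of arguments into the callee's stack frame can be observed directly. The sketch below uses two made-up functions, add_ten and caller_value_after_call; it only relies on C's pass-by-value rule.

```c
/* Hypothetical function: modifies only its own copy of the argument. */
int add_ten(int n)
{
    n += 10;              /* changes the parameter slot in this stack frame */
    return n;
}

/* Shows that the caller's variable is untouched by the call. */
int caller_value_after_call(void)
{
    int x = 5;            /* local: initialized by generated code on entry */
    (void)add_ten(x);     /* the value of x is copied into add_ten's parameter */
    return x;             /* still 5 */
}
```

add_ten(5) returns 15, while caller_value_after_call() returns 5, because the addition happened on the copy held in add_ten's frame.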

Here, we will take a small program and see where different program elements are stored when that program executes. The comments explain where the variables get stored.

#include <stdio.h>
#include <stdlib.h>

int bss1;
static double bss2;
char *bss3;
// these are stored in the initialized-to-zero segment, also known as the un-initialized data segment (BSS)
int init1=0; // explicitly initialized; note that gcc by default may still place a zero-initialized variable in BSS
float init2=10.0f;
char *init3="hello world";
// these are stored in the initialized data segment
// the code for the main function gets stored in the code segment
int main(void)
{
    int local1=10; // this variable is stored on the stack and initialization code is generated by the compiler
    int local2; // this variable is not initialized, hence it has a garbage value; it does not get initialized to zero
    static int local3; // this is allocated in the BSS segment and gets initialized to zero
    static int local4=100; // this gets allocated in the initialized data segment
    int (*local_foo)(const char *, ...)=printf; // printf is in a shared library (libc or C runtime library)
    // local_foo is a local variable (function pointer) that points to the printf function
    local_foo("hello world"); // this function call results in the creation of a stack frame in the stack area
    int *local5=malloc(sizeof(int));
    // local5 is allocated on the stack; however, it points to a dynamically allocated block in the heap
    free(local5);
    return 0;
    // the stack frame for the main function gets destroyed after executing main
}

There are several tools to check where a variable gets stored in memory, but the easiest one to use is nm.

Using nm tool

gcc program_name.c
nm ./a.out

If no arguments are given to nm, it assumes that it should take a.out as its input. We get some cryptic output like the one below.

[Figure: full output of nm ./a.out]

Where do the symbols that we did not type come from? They have been inserted behind the scenes by the compiler for various reasons. We can ignore them for now.

Now, what are those strange numbers followed by letters (b, B, t)? Each line shows the symbol value, followed by the symbol type (displayed as a letter) and the symbol name.

The symbol type requires more explanation. A lowercase letter means the symbol is local (visible only within the file) and an uppercase letter means it is global (externally available from the file).
B un-initialized data section (BSS)
D initialized data section
T text/code section
U undefined (referenced here but defined elsewhere)

Example 1: nm ./a.out | grep bss

[Figure: nm output filtered with grep bss]

Variables bss1 and bss3 got allocated in the BSS segment as global symbols (B). Since we declared bss2 as static, it is listed as b (accessible only within the file).

Example 2: nm ./a.out | grep init

[Figure: nm output filtered with grep init]

These are explicitly initialized and are allocated into initialized data section.

Example 3: nm ./a.out | grep local

[Figure: nm output filtered with grep local]

Only local3 and local4 are allocated global memory. Since local3 is un-initialized, it is allocated in the BSS, and since local4 is explicitly initialized, it is allocated in the initialized data segment. As both are local, they are indicated by lowercase letters. Since they are local to the function, and to avoid accidentally mixing them up with other local static variables of the same name, they have been suffixed with some numbers. (Compilers differ in their approaches to treating local static variables. This approach is for gcc.)

08048354 T main
         U malloc@@GLIBC_2.0
         U printf@@GLIBC_2.0

The main function is allocated in the text/code segment. Obviously we can access this function from outside the file (to start the execution). So the type of this symbol is T.

The malloc and printf functions used in the program are not defined in the program itself (header files only declare them, they don't define them). They are defined in the shared library GLIBC, version 2.0. That's what the suffix @@GLIBC_2.0 implies.

Installing Fedora directly from ISO image

Let's assume you have downloaded a new version of a distro and are in the mood to try it out right away. It's past midnight and you realise that you've run out of blank CDs/DVDs. So you will have to wait till the morning when the shops open, to be able to burn the distro image in order to install it. I'm sure a lot of us often face this problem. In this article I'll share a simple trick that I came across on the web, by which you can install a new distribution without burning it to a CD/DVD. The only requirement is that you should have a pre-installed GNU/Linux system, which you already have, I assume.

All Linux installers use two files to boot a computer: a kernel and an initial root file system—also known as the RAM disk or initrd image. This initrd image contains a set of executables and drivers that are needed to mount the real root file system. When the real root file system mounts, the initrd is unmounted and its memory is freed. These two files are named differently in different distributions—refer to Table 1 for their names.
Table 1: Names of kernel and RAM disk images in some popular distributions

The first thing you need to do is place the ISO image(s) inside a directory. Some installers are not able to read the ISO images if they are buried deep inside nested directories. So, just to be on the safe side, use a directory at the root of the file system. The partition on the hard disk holding the ISO files must be formatted with the ext2, ext3 or vfat file system.

In our example, let’s go ahead and do it with an old Fedora 9 ISO image. Follow these steps to begin with:

# mkdir /fedora
# cp /home/dayanand/Fedora-9-i386-DVD.iso /fedora/fedora9.iso

Now extract the kernel and initrd files from the ISO image and place them in the same directory in which you placed the ISO. You can use File Roller, the archive manager for GNOME, to extract the files. Just right click on the ISO and select “Open with File Roller”. It displays the contents of the ISO image. Then navigate to the isolinux directory—in Fedora 9 these two files are placed inside the isolinux directory; it’s often different for other distributions, so please refer to Table 1 for the paths. Select the kernel and initrd files, and extract them to the location where your ISO image exists.

The second method is to mount the ISO image and extract the files. Run the following commands to do this:

# mount -o loop /fedora/fedora9.iso /media/iso
# cd /media/iso/isolinux
# cp vmlinuz initrd.img /fedora/

I have mounted the ISO image without providing the -t iso9660 option (to specify the type of media as an ISO filesystem). It worked for me. If the above mount command doesn’t work, do add this option along with the rest of the mount command above.

Note: Fedora 10 has introduced a change in the Anaconda installer. So, in addition to the vmlinuz and initrd.img files, you will also need to copy the images/install.img file: create a directory called /fedora/images and place the install.img file there.

Now, it’s time to edit the /boot/grub/menu.lst file on the system I’m currently using—Ubuntu 8.10. Note that this is the location of the Grub menu in almost all distros, except for Fedora/Red Hat, where it’s called /boot/grub/grub.conf. Append the following entry there:

title Install Linux
root (hdX,Y)
kernel /distro/Linux_kernel
initrd /distro/Ram_disk

In this case…

1. ‘title’ is the name you want to display in your GRUB menu
2. ‘root’ is the hard disk partition that contains the ISO image
3. ‘kernel’ is the Linux kernel
4. ‘initrd’ is the initial RAM disk image

For our Fedora 9 ISO, the menu.lst entry looks like this:

title Install Fedora 9
root (hd0,4)
kernel /fedora/vmlinuz
initrd /fedora/initrd.img

Now you are ready to install your new Linux distro directly from the hard disk without the need for a CD/DVD drive. Reboot your system and select the ‘Install Fedora 9’ entry from your GRUB menu.
Figure 1 shows what the GRUB menu looks like after rebooting my system.
[Figure 1: GRUB menu showing the 'Install Fedora 9' entry]

Obviously, I selected the ‘Install Fedora 9’ entry and it has started booting my system with the help of vmlinuz and initrd.img files. The set-up prompts me to choose a language and keyboard layout. Then it prompts me to select the ‘Installation Method’ as shown in Figure 2.
[Figure 2: the 'Installation Method' screen]

In this screen you need to select the ‘Hard drive’ option and proceed to the next screen. Here, you have to select the appropriate partition and the directory where the installation image exists. In my system, the installation image exists in the /fedora directory of /dev/sda5 partition. This is shown in Figure 3.
[Figure 3: selecting the partition and directory that hold the installation image]

After this, it picks up the Anaconda installer of Fedora 9 (or any other installer, as in your case) from the prescribed location, and proceeds with the regular installation procedure just like you’d get if you were installing from a bootable optical media. Follow the steps as you would to install the distro. Figure 4 shows the package installation in action. After that’s done, reboot and you’ll be able to use your newly installed operating system.
[Figure 4: package installation in progress]
Easy enough, right? So, I hope you’ll start using this simple trick to install the newly released GNU/Linux distros and stop worrying about whether you have the required blank optical media. And the additional environmental benefit is less use of non-biodegradable plastic materials (which is what a CD/DVD is made out of).

What was MySql root Password!!!!

Many people use the MySQL open source relational database server in web development and in educational institutions. One problem that many run into with MySQL is forgetting the root password. If this happens, we are locked out of our database server and can't access it. In such situations, knowing how to regain root access to your database server comes in handy. With a quick survey of the internet, I found this very simple and very useful information about how to recover a lost MySQL root password. It is very well written and easy to follow. Here's what you can do to reset the password for the root user in MySQL on both Windows and Linux: restart your MySQL server and tell it to skip the grant tables. The detailed description is given below.

Windows Users:

For MySQL servers below version 5.0

1. Stop your MySQL server completely. This can be done by accessing the Services window inside Windows XP and Windows Server 2003, where you can stop the MySQL service.

2. Open your MS-DOS command prompt using “cmd” inside the Run window. Inside it navigate to your MySQL bin folder, such as C:\MySQL\bin using the cd command.

3. Execute the following command in the command prompt: mysqld.exe -u root --skip-grant-tables

4. Leave the current MS-DOS command prompt as it is, and open a new MS-DOS command prompt window.

5. Navigate to your MySQL bin folder, such as C:\MySQL\bin using the cd command.

6. Enter “mysql” and press enter.

7. You should now have the MySQL command prompt working. Type “use mysql;” so that we switch to the “mysql” database.

8. Execute the following command to update the password:

UPDATE user SET Password = PASSWORD('NEW_PASSWORD') WHERE User = 'root';

At this point you can also run any other SQL command that you wish.

After you are finished, close the first command prompt and type "exit;" in the second command prompt window to disconnect. You can now start the MySQL service.

For MySQL servers above version 5.0

Log on to your server as the Administrator. Kill the MySQL server if it’s running. To do this you need the Windows Services Manager, so click on the Start Menu, then go to the Control Panel, then to the Administrative Tools, and select Services. Here look for the MySQL server and stop it. If it’s not listed there and MySQL is still running it means that MySQL is not running as a service. In that case you need to load the Task Manager which you should be able to access using the key combination of Ctrl+Alt+Del. Now kill the MySQL process.

With the MySQL process stopped you need to force a change of passwords on MySQL using a combination of the UPDATE and FLUSH options. So launch your favorite text editor and create a new file. Enter the following text into the file replacing “NewMySQLPassword” with your new password:

UPDATE mysql.user SET Password=PASSWORD('NewMySQLPassword') WHERE User='root';
FLUSH PRIVILEGES;

What the first line does is that it updates the value of the field “Password” in the table mysql.user for the user “root” to “NewMySQLPassword”. The second line flushes the old set of privileges and makes sure your new password is used everywhere. Save this text as C:\mysql_reset.txt.

Next, you need to start your MySQL server passing this file as a configuration parameter. Launch a terminal by going to the Start Menu, then to Run, and then type cmd and hit Enter. Now enter the following command:

C:\mysql\bin\mysqld-nt --init-file=C:\mysql_reset.txt

Once the server is done starting, delete the file C:\mysql_reset.txt. Your MySQL root password should be reset now. Now restart your MySQL server. Go back to the Windows Services Manager again to do that. Your new MySQL root password should work for you now.

Linux Users:

Log on to your Linux machine as the root user. The steps involved in resetting the MySQL root password are to stop the MySQL server, restart it without the permissions active so you can log into MySQL as root without a password, set a new password, and then restart it normally. Here’s how you do it. First, stop the MySQL server:

# /etc/init.d/mysql stop

Now start the MySQL server using the --skip-grant-tables option, which will run the server without loading the permissions settings:

# mysqld_safe --skip-grant-tables &

The & at the end makes the command run as a background process. Now log on to your MySQL server as root:

# mysql -u root

It should allow you in without prompting for a password. The following steps will set the new password:

mysql> use mysql;
mysql> update user set password=PASSWORD('NewMySQLPassword') where User='root';
mysql> flush privileges;
mysql> quit

Replace “NewMySQLPassword” with your own password. Here’s what happens here. The first line selects the MySQL configuration tables. The second line updates the value of the field “Password” for the user “root” to “NewMySQLPassword”. The third line flushes the old set of privileges and makes sure your new password is used everywhere. Now, the last step is to restart the server normally and use your new root password to log in:

# /etc/init.d/mysql stop
# /etc/init.d/mysql start
# mysql -u root -pNewMySQLPassword

Congratulations, your new MySQL root password is set and your MySQL server is ready to be used again. Remember to update all your applications to use this password if you are using it anywhere.

Demystifying the Integer overflow in C language

What is integer overflow and why does it occur?

An integer overflow, or integer wrapping, is a potential problem in a program based upon the fact that the value that can be held in a numeric data type is limited by the data type’s size in bytes. ANSI C uses the following minimum sizes:

data type size (bytes)
char 1
short 2
int 2
long 4

In practice, many compilers use a 4-byte int. It should also be noted that the actual ranges for the data types depend on whether or not they are signed. For instance, a signed 2-byte short is guaranteed to hold at least -32767 to 32767 (-32768 to 32767 on two's complement machines), while an unsigned short may be between 0 and 65535. See your [include]/limits.h file for specific numbers for your compiler.

Why should you care?

If you try to put a value into a data type that is too small to hold it, the high-order bits are dropped, and only the low-order bits are stored. Another way of saying that is that modulo-arithmetic is performed on the value before storing it to make sure it fits within the data type. Taking our unsigned short example:

Limit: 65535 or 1111 1111 1111 1111
Too big: 65536 or 1 0000 0000 0000 0000
What’s stored: 0 or 0000 0000 0000 0000

As the above makes evident, that result is because the high-order (or left-most) bit of the value that’s too big is dropped. Or you could say that what’s stored is the result of
Stored = value % (limit + 1) or 65536 % (65535 + 1) = 0
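The wrap-around of the unsigned example can be reproduced with a fixed-width type. This is only a sketch: store_in_u16 is a name invented here, and uint16_t from stdint.h stands in for the 2-byte unsigned short of the text.

```c
#include <stdint.h>

/* Stores a value into a 16-bit unsigned type. Conversion to an unsigned
   type is defined by the C standard as reduction modulo 2^16, which is
   the same as keeping only the low-order 16 bits. */
uint16_t store_in_u16(uint32_t value)
{
    return (uint16_t)value;
}
```

With this helper, 65535 is stored unchanged, 65536 becomes 0 and 65537 becomes 1, matching value % (limit + 1).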
In signed data types, the result is a little different and results in some seemingly weird behavior:

Positive limit: 32767 or 0111 1111 1111 1111
Too big: 32768 or 1000 0000 0000 0000
What’s stored: -32768

Why’s that?

It's because of "2's complement," which is how negative numbers are represented in binary. To make a long story short, the first half of the range (0 thru 0111 1111 1111 1111) is used for positive numbers in order of least to greatest. The second half of the range (1000 0000 0000 0000 thru 1111 1111 1111 1111) is then used for negative numbers, again in order of least to greatest. So, the negative range for a signed 2-byte short is -32768 thru -1, in that order.
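To see the two halves of the range, one can copy a 16-bit pattern into a signed 16-bit variable without any arithmetic. The helper name bits_as_signed is made up for this sketch; it relies on int16_t, which the C standard defines as a two's complement type.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets a 16-bit pattern as a signed two's complement value.
   memcpy copies the bits unchanged, so no conversion rules are involved. */
int16_t bits_as_signed(uint16_t bits)
{
    int16_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

The pattern 0111 1111 1111 1111 (0x7FFF) reads as 32767, 1000 0000 0000 0000 (0x8000) reads as -32768, and 1111 1111 1111 1111 (0xFFFF) reads as -1.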

When does it occur?

Integer overflow happens because computers use a fixed width to represent integers. So which operations can result in overflow? Bitwise and logical operations cannot overflow, while cast and arithmetic operations can. For example, the ++ and += operators can overflow, whereas the && or & operators (or even the << and >> operators) cannot.

Regarding arithmetic operators, it is obvious that operations like addition, subtraction and multiplication can overflow. How about operations like (unary) negation, division and mod (remainder)? For unary negation, -MIN_INT is equal to MIN_INT (and not MAX_INT), so it overflows. Following the same logic, division overflows for the expression (MIN_INT / -1). How about a mod operation? Mathematically, it does not overflow: the only possible overflow case (MIN_INT % -1) is equal to 0 (verify this yourself; the formula for the % operator is a % b = a - ((a / b) * b)). Be aware, though, that this formula evaluates a / b, so C leaves MIN_INT % -1 undefined, and on some processors (x86, for example) it traps just like the division.
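The special cases above can be turned into small guard functions. This is a sketch under two assumptions: the constant is spelled INT_MIN (as limits.h names it, where the text says MIN_INT), and the machine uses two's complement. The function names are invented for illustration.

```c
#include <limits.h>

/* -a cannot be represented when a is INT_MIN. */
int negation_overflows(int a)
{
    return a == INT_MIN;
}

/* a / b overflows only for INT_MIN / -1 (division by zero is a separate error). */
int division_overflows(int a, int b)
{
    return a == INT_MIN && b == -1;
}

/* Remainder that never evaluates INT_MIN % -1. The caller must ensure b != 0. */
int checked_mod(int a, int b)
{
    if (b == -1)
        return 0;          /* a % -1 is always 0 mathematically */
    return a % b;
}
```

A caller would test negation_overflows() or division_overflows() before performing the operation, and use checked_mod() in place of a bare % when the divisor may be -1.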

What happens when it occurs?

Suppose memory is being allocated based on an unsigned integer data type's value. If that value has wrapped around, it may be that far too little memory will be made available. Or suppose a comparison is being made between a signed integer value and some other number, assuming that the former should be less than the latter: if that value has overflowed into the negative, the comparison would pass. But are things going to behave the way the programmer intended? Probably not.
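For the allocation case, a common defensive pattern is to check the multiplication before calling malloc. The wrapper below is a hypothetical helper written for this post, not a library function; SIZE_MAX comes from stdint.h.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for count elements of elem_size
   bytes each, and returns NULL instead of a too-small block when the
   product count * elem_size would wrap around. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;                   /* the multiplication would overflow */
    return malloc(count * elem_size);
}
```

The check divides instead of multiplying, so the test itself can never wrap. The standard calloc() function is generally expected to perform an equivalent check internally.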

Sorting and searching techniques are among the most important for a programmer, and we use them in most of the programs we write. The most widely used, for efficiency, are binary search and quick sort. These algorithms compute the mean of two values; if the values are integers, computing the mean is prone to integer overflow, which leads to undesirable results or bugs. So, let's look at some techniques to detect an overflow before it occurs.
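For the mean used in binary search, the overflow can be avoided altogether rather than detected. The sketch below assumes array indices with 0 <= low <= high; midpoint and find_sorted are names chosen for this example.

```c
/* Midpoint of low..high, assuming 0 <= low <= high.
   (low + high) / 2 may overflow for large indices; high - low cannot. */
int midpoint(int low, int high)
{
    return low + (high - low) / 2;
}

/* Binary search: returns the index of key in sorted[0..count-1], or -1. */
int find_sorted(const int *sorted, int count, int key)
{
    int low = 0, high = count - 1;

    while (low <= high) {
        int mid = midpoint(low, high);
        if (sorted[mid] == key)
            return mid;
        if (sorted[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return -1;
}
```

With a 32-bit int, midpoint(2000000000, 2100000000) gives 2050000000, whereas the naive sum 4100000000 does not fit in an int at all.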

For the statement int k = (i + j);
• If i and j are of different signs, it cannot overflow.
• If i and j are of same signs (- or +), it can overflow.
• If i and j are positive integers, then their sign bit is zero. If k is negative, it means its sign bit is 1—it indicates the value of (i + j) is too large to represent in k, so it overflows.
• If i and j are negative integers, then their sign bit is one. If k is positive, it means its sign bit is 0—it indicates that the value of (i + j) is too small to represent in k, so it overflows.

How to check for the overflow?

To check for overflow, we have to provide checks for conditions 3 and 4. Here is the straightforward conversion of these two statements into code. The function isSafeToAdd(), returns true or false, after checking for overflow.

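/* Is it safe to add i and j without overflow? Return value 1 indicates there is no overflow; else it is overflow and not safe to add i and j */
int isSafeToAdd(int i, int j)
{
    int k = i + j;
    if( ((i < 0 && j < 0) && k >= 0) || ((i > 0 && j > 0) && k < 0) )
        return 0;
    return 1; /* no overflow: safe to add i and j */
}

This does the job, but can it be improved? Overflow occurs if ((i + j) > INT_MAX) or if ((i + j) < INT_MIN). We cannot test ((i + j) > INT_MAX) || ((i + j) < INT_MIN) directly, because the sum itself is what overflows. Instead of ((i + j) > INT_MAX), we can check the condition (i > INT_MAX - j) by moving j to the RHS of the expression. So, the condition in isSafeToAdd can be rewritten as: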

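if( (j > 0 && i > INT_MAX - j) || (j < 0 && i < INT_MIN - j) ) /* the sign tests keep INT_MAX - j and INT_MIN - j themselves from overflowing */
return 0;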

That works! But can we simplify it further? From condition 2, we know that for an overflow to occur, the signs of i and j should be the same. Looking at conditions 3 and 4, the sign bit of the result (k) is different from the sign bit of both i and j. Does this suggest a check that uses the ^ operator? How about this check:

int k = (i + j);
if( ((i ^ k) & (j ^ k)) < 0)
return 0;

Let us check it. Assume that i and j are positive values; when the sum overflows, the result k will be negative. Now (i ^ k) will be a negative value: the sign bit of i is 0 and the sign bit of k is 1, so the ^ of the sign bits is 1 and hence the value of the expression (i ^ k) is negative. The same holds for (j ^ k), and the & of two negative values is also negative; hence, the check with < 0 becomes true when there is overflow. When i and j are negative and k is positive, the condition is again < 0 (following the same logic described above).

So, yes, this also works! Though the if condition is not very easy to understand, it is correct and is also an efficient solution!
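Both checks can be written out as complete functions. This is a sketch rather than the article's exact code: the function names are new, and the sign-bit version forms the sum in unsigned arithmetic, because signed overflow is undefined in ISO C while unsigned wrap-around is well defined. It also assumes that int and unsigned int have the same width with no padding bits, which holds on common platforms.

```c
#include <limits.h>

/* Range check: never evaluates i + j, so nothing can overflow. */
int is_safe_to_add_range(int i, int j)
{
    if (j > 0 && i > INT_MAX - j)
        return 0;                         /* sum would exceed INT_MAX */
    if (j < 0 && i < INT_MIN - j)
        return 0;                         /* sum would go below INT_MIN */
    return 1;
}

/* Sign-bit check: same idea as ((i ^ k) & (j ^ k)) < 0, but the sum is
   computed on unsigned copies and only the top bit is inspected. */
int is_safe_to_add_xor(int i, int j)
{
    unsigned int ui = (unsigned int)i;
    unsigned int uj = (unsigned int)j;
    unsigned int uk = ui + uj;                   /* wraps modulo 2^N */
    unsigned int sign_mask = ~(UINT_MAX >> 1);   /* only the top bit set */

    /* Top bit set in both XORs: i and j share a sign and k differs. */
    return ((ui ^ uk) & (uj ^ uk) & sign_mask) == 0;
}
```

Both functions return 1 when the addition is safe and 0 when it would overflow, so either can be called before computing i + j.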

Some of the puzzling things about C Language

Can you guess why there is no distinct format specifier for ‘double’ in the printf/scanf format string, although it is one of the four basic data types? (Remember we use %lf for printing the double value in printf/scanf; %d is for integers).

Ans: In older versions of C, there was no ‘double’—it was just ‘long float’ type—and that is the reason why it has the format specifier ‘%lf’ (‘%d’ was already in use to indicate signed decimal values). Later, double type was added to indicate that the floating point type might be of ‘double precision’ (IEEE format, 64-bit value). So a format specifier for long float and double was kept the same.
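A short sketch of how the specifiers are used in practice follows. The helper names are made up; the library behaviour is standard: the scanf family needs %lf to store into a double, while the printf family prints a double with %f (and, since C99, accepts %lf with the same meaning).

```c
#include <stdio.h>

/* Parses a double from text. Returns 1 on success, 0 otherwise. */
int parse_double(const char *text, double *out)
{
    return sscanf(text, "%lf", out) == 1;   /* %lf: pointer to double */
}

/* Formats a double with two decimals into buf.
   Returns the number of characters the full result needs. */
int format_double(double value, char *buf, size_t size)
{
    return snprintf(buf, size, "%.2f", value);   /* %f: double argument */
}
```

Parsing "2.5" and formatting the result produces the text 2.50.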

Why is the output file of the C compiler called a.out?

Ans: The a.out stands for ‘assembler.output’ file [2]. The original UNIX was written using an assembler for the PDP-7 machine. The output of the assembler was a fixed file name, which was a.out to indicate that it was the output file from the assembler. No assembly needs to be done in modern compilers; instead, linking and loading of object files is done. However, this tradition continues and the output of cc is by default a.out!

What is the use of the volatile keyword in C?

The meaning of volatile is a popular interview question, particularly for freshers and I’ve read articles describing the properties of this keyword as if it had super powers.

The volatile keyword does a very simple job. When a variable is marked as volatile, the programmer is instructing the compiler not to cache this variable in a register but instead to read the value of the variable from memory each and every time the variable is used. That’s it – simple isn’t it?

To illustrate the use of the keyword, consider the following example:

volatile int* vp = SOME_REGISTER_ADDRESS;
for(int i=0; i<100; i++)
foo(*vp);

In this simple example, the pointer vp points to a volatile int. The value of this int is read from memory for each loop iteration. If volatile was not specified then it is likely that the compiler would generate optimized code which would read the value of the int once, temporarily store this in a register and then use the register copy during each iteration.
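The register example cannot be run without real hardware, so here is a runnable sketch that swaps in a different scenario: a flag changed by a signal handler. The names stop_requested, on_signal and flag_seen_after_signal are invented for the example; signal(), raise() and sig_atomic_t are standard C.

```c
#include <signal.h>

/* Flag changed outside the normal flow of control (here, by a signal
   handler). volatile tells the compiler to re-read it from memory. */
static volatile sig_atomic_t stop_requested = 0;

static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Installs the handler, raises the signal, and reports the flag.
   Returns 1 if the change made by the handler was seen, -1 on error. */
int flag_seen_after_signal(void)
{
    stop_requested = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);              /* the handler runs before raise() returns */
    return stop_requested;      /* read from memory, not from a cached copy */
}
```

In a real program the flag would typically be polled in a loop, for example while (!stop_requested) { ... }, and that is where a missing volatile could let the compiler read the flag only once.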

I hope that after reading this you can reply confidently when the same question is asked in an interview.
