Saturday, February 12, 2011

Thread Local Storage

Thread Local Storage (TLS) is another area of memory management that defines memory on a per-thread basis. All threads of a process share its virtual address space, where the static and global variables are shared by all threads in the process, and the local variables of a function are unique to each thread that runs the function. With TLS, you can provide unique data for each thread that the process accesses by using a global index. One thread allocates the index, which can be used by the other threads to retrieve the unique data associated with the index.

The typical scenario in which TLS is used in Windows is within a dynamic-linked library (DLL), but this is not its only possible use. In the case of the DLL scenario, the use of TLS includes the following details:

  • When a DLL attaches to a process, the DLL uses TlsAlloc to allocate a TLS index. The DLL then allocates some dynamic storage to be used exclusively by the initial thread of the process. It uses the TLS index in a call to the TlsSetValue function to store the address in the TLS slot. This concludes the per-thread initialization for the initial thread of the process. The TLS index is stored in a global or static variable of the DLL.

  • Each time the DLL attaches to a new thread of the process, the DLL allocates some dynamic storage for the new thread and uses the TLS index in a call to TlsSetValue to store the address in the TLS slot. This concludes the per-thread initialization for the new thread.

  • Each time an initialized thread makes a DLL call requiring the data in its dynamic storage, the DLL uses the TLS index in a call to TlsGetValue to retrieve the address of the dynamic storage for that thread.

The following functions are used to manage TLS in UNIX and Windows:

  • Allocating memory. In the Windows environment, the TlsAlloc function allocates a TLS index. A TLS index is used by a thread to store and retrieve values that are local to the thread. The minimum number of indexes available to each process is defined by TLS_MINIMUM_AVAILABLE. TLS indexes are not valid across process boundaries. The prototype of the function is as follows:

    DWORD TlsAlloc(void);

    In the UNIX environment, pthread_key_create creates a thread-specific data key visible to all the threads in the process. Upon key creation, the value NULL is associated with the new key in all active threads. An optional destructor function may be associated with each key value. The prototype of the function is as follows:

    Note: The line has been split into multiple lines for readability.However, while trying it out on a system you must enter it as one line without breaks.

    int pthread_key_create(pthread_key_t *key,
  •                        void   (*destructor, void*));
  • Deleting memory. In the Windows environment, TlsFree releases a TLS index. This, however, does not release the data allocated and set in the TLS index slot. The prototype of the function is as follows:

    BOOL TlsFree(DWORD dwTlsIndex);

    In the UNIX environment, the pthread_key_delete function deletes the thread-specific data key.

    int pthread_key_delete(pthread_key_t key);
  • Storing a value. In the Windows environment, the TlsSetValue function stores memory in a TLS index. The prototype of the function is as follows:

    BOOL TlsSetValue(DWORD dwTlsIndex, LPVOID lpTlsValue);

    In the UNIX environment, the pthread_setspecific function associates a thread-specific value with the key. The prototype of the function is as follows:

    int pthread_setspecific(pthread_key_t key, const void *value);

  • Retrieving a value. In the Windows environment, the TlsGetValue function returns a memory element stored in a specified TLS index. The prototype of the function is as follows:

    LPVOID TlsGetValue(DWORD dwTlsIndex);

    In the UNIX environment, the pthread_getspecific function returns the values currently bound to the specific key. The prototype of the function is as follows:

    void *pthread_getspecific(pthread_key_t key);

Note Additional information is available at
http://msdn.microsoft.com/library/en-us/winprog/winprog/windows_api_reference.asp.

Thread Local Storage (TLS) Example

The following section shows a portion of an example application. It illustrates allocation and access to a memory space on a per-thread basis. First, the main thread of the process allocates a memory slot. The memory slot is then accessed and modified by a child thread. If several instances of the thread are active, each thread procedure will have a unique TLS index value to ensure the separation and isolation of data and state.

UNIX example: Using TLS with pthread library

#include  #include  #include  int main(void) 
{
pthread_key_t k1;
int ret;
int val = 100;
int *pval = NULL;
ret = pthread_key_create(&k1, NULL);
if (ret)
{
printf("pthread_key_create: %s\n", strerror(ret));
return -1;
}
ret = pthread_setspecific(k1, (void *) &val);
if (ret)
{
printf("pthread_setspecific: %s\n",
strerror(ret));
return -2;
}
pval = (int*)pthread_getspecific(k1);
if(pval == NULL)
{
printf("pthred_getspecific: NULL returned");
return -3;
}
printf("pthread_getspecific value: %d\n",*pval);
ret = pthread_key_delete(k1);
if (ret)
{
printf("pthread_key_delete: %s\n", strerror(ret));
return -4;
}
return 0;
}

Wednesday, February 2, 2011

Recursion ( Tool for Lazy Developer)

Lets take an example : find the sum all the n elements in an int array A.

Now I ask a very simple question to A, Can you please tell me the
sum of all the elements inside you?

A says, of course yes..

int contractor=0;
for(int i = 0; i < n; ++i )
contractor += i;

What did A do here?
A actually hired a contractor who goes to each of A's element, collects the data and sums it up.

Another day A was asked the same question.
Since he was feeling lazy to hire the contractor he just hands over this task to A1. A1 too being lazy announces that I don't know the solution but if A2 can give me the sum of elements from A2 to An, I can solve it, so A1 reduces the size of the problem and hands it over to A2. A2, A3,..they all did the same thing. finally when An-1 asks the same problem to An, An realizes that he knows the solution of sum of all elements from An onwards because there is no-one beyond An, so An just returns its value as the solution to An-1. during return, An-1, An-2, An-3...they all adds up their value to the solution and returns the final sum to the caller. This way, A will get the solution without hiring the contractor, because everyone did their own part of work in solving the problem.

Don't ever worry about writing the complete solution of the recursive problem. Don't be scared of it, Recursion in the simplest thing in the world. Here are the important steps for solving recursive problems.

1. Every recursive problem statement is a question asked to some object. Think about how to user of your application is going to use your function i.e
height(root) ==> what is the height ? asked to the root pointer of the tree.
factorial(n) ==> what is the factorial ? asked to an Integer n.
preOrder(root) ==> what is the pre-order? asked to the root pointer of the tree.

2. If the object doesn't know the exact solution to the question it assumes that it knows someone who can solve the problem of size N-1 and then it just works on their result to build the final solution.

3. If the object knows the exact solution to the question asked, it simply returns the solution to the caller.


Example: Is there any path in a binary tree whose path sum is S. =>doesSumExist(S, root);

Step 1: question is "Is there any path you know whose sum is S?" object here is root.

Step 2: since root in itself doesn't know the answer and root->left and root->right are the only guys it know of, so it announces that "If root->left and root->right both can answer my question "Is there any path you know whose sum is (S - value)" then I can solve the actual problem which was asked to me".

Step 3: when the question percolated down to leaf nodes of the tree, they knew the answer so they responded in "yes" or "no".

so code goes like..

bool doesSumExist(int S, Node *root)
{
if(root==NULL)
return FALSE;
if(root->left == NULL && root->right == NULL)
return S==root->value;
else
return doesSumExist(S -- root->value, root->left) ||
doesSumExist(S - root->value, root->right);
}

Conclusion : Try to be as lazy as possible when you take a recursive approach and rely completely on your neighbors for the solution. If you know the solution just return it otherwise just perform the additional calculation on top of what your neighbor returns in order to solve the problem.

Fixing the boot issue

In Redhat Linux, during grub display, just type c for command-line

option of GRUB: To boot Linux do this:
grub> help

grub> root
(hd1,1): Filesystem is type ext2fs, partition type 0x83
grub> root (hd1,0)

grub> kernel /boot/vmlinuz

grub> boot

To boot MS Windows 95/2000 etc do this: If you want to boot an
unsupported operating system (e.g. Windows 95), chain-load a boot
loader for the operating system. Normally, the boot loader is embedded
in the boot sector of the partition on which the operating system is

grub> help
grub> help rootnoverify
grub> rootnoverify (hd0,0)
grub> makeactive
grub> chainloader +1
grub> boot

Compiling the Linux Kernel

Steps to follow:

  1. Unpack the sources

  2. Optional - Copy config file : You can copy the config file from your old linux kernel source tree to new kernel tree (may save time, if you want to reuse the old settings).

  3. make clean; make mrproper

  4. make xconfig

  5. make dep

  6. Give a unique name to your new Kernel - Edit /usr/src/linux/Makefile and change EXTRAVERSION

  7. nohup make bzImage

  8. 'make modules' and 'make modules_install'

  9. And you can go to lunch or go to bed (have nice Linux dreams in sleep) and when you come back the system is ready! And see the log with 'less nohup.out'.

  10. make install [num ] But NOT recommended - use cp /usr/src/linux/arch/i386/boot/bzImage /boot/bzImage.myker

  11. Configure GRUB or LILO.

  12. Reboot and check new kernel is booting

  13. Create emergency boot disk - bzdisk or mkbootdisk

  14. Optional - make rpm [num ] To build rpm packages

  15. Optional - make clean (If you want to free up disk space)

Tuesday, February 1, 2011

Ten Commands Every Linux Developer Should Know

This article presents a list of commands you should be able to find on any Linux installation. These are tools to help you improve your code and be more productive. The list comes from my own experience as a programmer and includes tools I've come to rely on repeatedly. Some tools help create code, some help debug code and some help reverse engineer code that's been dumped in your lap.

1. ctags

Those of you addicted to integrated development environments (IDEs) probably never heard of this tool, or if you did you probably think it's obsolete. But a tags-aware editor is a productive programming tool.

Tagging your code allows editors like vi and Emacs to treat your code like hypertext (Figure 1). Each object in your code becomes hyperlinked to its definition. For example, if you are browsing code in vi and want to know where the variable foo was defined, type :ta foo. If your cursor is pointing to the variable, simply use Ctrl-right bracket.

Figure 1. gvim at Work with Tags

The good news for the vi-impaired is ctags is not only for C and vi anymore. The GNU version of ctags produces tags that can be used with Emacs and many other editors that recognize tag files. In addition, ctags recognizes many languages other than C and C++, including Perl and Python, and even hardware design languages, such as Verilog. It even can produce a human-readable cross-reference that can be useful for understanding code and performing metrics. Even if you're not interested in using ctags in your editor, you might want to check out the human-readable cross-reference by typing ctags -x *.c*.

What I like about this tool is that you get useful information whether you input one file or one hundred files, unlike many IDEs that aren't useful unless they can see your entire application. It's not a program checker, so garbage in, garbage out (GIGO) rules apply.

2. strace

strace lets you decipher what's going on when you have no debugger nor the source code. One of my pet peeves is a program that doesn't start and doesn't tell you why. Perhaps a required file is missing or has the wrong permissions. strace can tell you what the program is doing right up to the point where it exits. It can tell you what system calls the program is using and whether they pass or fail. It even can follow forks.

strace often gives me answers much more quickly than a debugger, especially if the code is unfamiliar. On occasion, I have to debug code on a live system with no debugger. A quick run with strace sometimes can avoid patching the system or littering my code with printfs. Here is a trivial example of me as an unprivileged user trying to delete a protected file:

strace -o strace.out rm -f /etc/yp.conf

The output shows where things went wrong:

lstat64("/etc/yp.conf", {st_mode=S_IFREG|0644,
st_size=361, ...}) = 0
access("/etc/yp.conf", W_OK) = -1 EACCES
(Permission denied)
unlink("/etc/yp.conf") = -1 EACCES (Permission
denied)

strace also lets you attach to processes for just-in-time debugging. Suppose a process seems to be spending a lot of time doing nothing. A quick way to find out what is going on is to type strace -c -p mypid. After a second or two, press Ctrl-C and you might see a dump something like this:

% time    seconds  usecs/call     calls    errors  syscall
------ ----------- ----------- --------- --------- ----------------
91.31 0.480456 3457 139 poll
6.66 0.035025 361 97 write
0.91 0.004794 16 304 futex
0.52 0.002741 14 203 read
0.31 0.001652 3 533 gettimeofday
0.26 0.001361 4 374 ioctl
0.01 0.000075 8 10 brk
0.01 0.000064 64 1 clone
0.00 0.000026 26 1 stat64
0.00 0.000007 7 1 uname
0.00 0.000005 5 1 sched_get_priority_max
0.00 0.000002 2 1 sched_get_priority_min
------ ----------- ----------- --------- --------- ----------------
100.00 0.526208 1665 total

In this case, it's spending most of its time in the poll system call—probably waiting on a socket.

3. fuser

The name is a mnemonic for file user and tells what processes have opened a given file. It also can send a signal to all those processes for you. Suppose you want to delete a file but can't because some program has it open and won't close it. Instead of rebooting, type fuser -k myfile. This sends a SIGTERM to every process that has myfile opened.

Perhaps you need to kill a process that forked itself all over the place, intentionally or otherwise. An unenlightened programmer might type something like ps | grep myprogram. This inevitably would be followed by several cut-and-paste operations with the mouse. An easier way is to type fuser -k ./myprogram, where myprogram is the pathname of the executable. fuser typically is located in /sbin, which generally is reserved for system administrative tools. You can add /usr/sbin and /sbin to the end of your $PATH.

4. ps

ps is used to find process status, but many people don't realize it also can be a powerful debugging tool. To get at these features, use the -o option, which lets you access many details of your processes, including CPU usage, virtual memory usage, current state and much more. Many of these options are defined in the POSIX standard, so they work across platforms.

To look at your running commands by pid and process state, type ps -e -o pid,state,cmd. The output looks like this:

4576 S /opt/OpenOffice.org1.1.0/program/soffice.bin -writer
4618 D dd if /dev/cdrom of /dev/null
4619 S bash
4645 R ps -e -o pid,state,cmd

Here you can see my dd command is in an uninterruptible sleep (state D). Basically, it is blocking while waiting for /dev/cdrom. My OpenOffice.org writer is sleeping (state S) while I type my example, and my ps command is running (state R).

For an idea of how a running program is performing, type:

ps -o start,time,etime -p mypid

This shows the basic output from the time command, discussed later, except you don't have to wait until your program is finished.

Most of the information that ps produces is available from the /proc filesystem, but if you are writing a script, using ps is more portable. You never know when a minor kernel rev will break all of your scripts that are mining the /proc filesystem. Use ps instead.

5. time

The time command is useful for understanding your code's performance. The most basic output consists of real, user and system time. Intuitively, real time is the amount of time between when the code started and when it exited. User time and system time are the amount of time spent executing application code versus kernel code, respectively.

Two flavors of the time command are available. The shell has a built-in version that tells you only scheduler information. A version in /usr/bin includes more information and allows you to format the output. You easily can override the built-in time command by preceding it with a backslash, as in the examples that follow.

A basic knowledge of the Linux scheduler is helpful in interpreting the output, but this tool also is helpful for learning how the scheduler works. For example, the real time of a process typically is larger than the sum of the user and system time. Time spent blocking in a system call does not count against the process, because the scheduler is free to schedule other processes during this time. The following sleep command takes one second to execute but takes no measurable system or user time:

\time -p sleep 1
real 1.03
user 0.00
sys 0.00

The next example shows how a task can spend all of its time in user space. Here, Perl calls the log() function in a loop, which requires nothing from the kernel:

\time perl -e 'log(2.0) foreach(0..0x100000)'
real 0.40
user 0.20
sys 0.00

This example shows a process using a lot of memory:

\time perl -e '$x = 'a' x 0x1000000'

0.06user 0.12system 0:00.22elapsed 81%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (309major+8235minor)pagefaults
0swaps

The useful information here is listed as pagefaults. Although the GNU time command advertises a lot of information, the 2.4 series of the Linux kernel stores only major and minor page-fault information. A major page fault is one that requires I/O; a minor page fault does not.

6. nm

This command allows you to retrieve information on symbol names inside an object file or executable file. By default, the output gives you a symbol name and its virtual address. What good is that? Suppose you are compiling code and the compiler complains that you have an unresolved symbol _foo. You search all of your source code and cannot find anywhere where you use this symbol. Perhaps it got pulled in from some template or a macro buried in one of the dozens of include files that compiled along with your code. The command:

nm -guA *.o | grep foo

shows all the modules that refer to foo. If you want to find out what library defines foo, simply use:

nm -gA /usr/lib/* | grep foo

The nm command also understands how to demangle C++ names, which can be handy when mixing C and C++. For example, forgetting to declare a C function with extern"C" produces a link time error something like this:

undefined reference to `cfunc(char*)'

In a large project with poorly defined headers, you might have a hard time tracking down the offending module. In this case, you can look for all the unresolved symbols in each object file with demangling turned on as follows:

nm -guC *.o
extern-c.o:cfunc
no-extern-c.o:cfunc(char*)

The first module is correct; the second is not.

7. strings

This command looks for ASCII strings embedded in binary files. It can be used for good or for evil. The good uses include trying to figure out what library is producing that cryptic string on stdout every once in a while, for example:

strings -f /usr/lib/lib* | grep "cryptic message"

On the evil side, the character strings can be used to probe your format strings looking for clues and vulnerabilities. This is why you should never put passwords and logins in your programs. It might be wise to examine your own programs with this tool and see what a clever programmer can see. The version of strings that comes with the GNU binutils has many useful options.

8. od, xxd

These two commands do basically the same thing, but each offers slightly different features. od is used to convert a binary file to whatever format you like. When dealing with programs that generate raw binary files, od can be indispensable. Although the name stands for octal dump, it can dump data in decimal and hexadecimal as well. od dumps integers, IEEE floats or plain bytes. When looking at multibyte integers or floats, the host byte order affects the output.

xxd also dumps binary files but does not try to interpret them as integers or floats, so the host byte order does not affect the output, which can be confusing or helpful depending on the file. Let's create a four-byte file on an Intel machine:

$ echo -n abcd > foo.bin
$ od -tx4 foo.bin
0000000 64636261
0000004

$ xxd -g4 foo.bin
0000000: 61626364 abcd

The output of od is a byte-swapped 32-bit integer, and the output of xxd is a group of four bytes in the same byte order as they appear in the file. If you're looking for the string abcd, xxd is the command for you. But, if you're looking for the 32-bit number 0x64636261, od is the right command.

xxd also knows a few cool tricks that od doesn't, including the ability to format the output in binary and to translate a binary file into a C array. Suppose you have a binary file that you want to encode inside an array in your C program. One way to do this is by creating a text file as follows:

$ xxd -i foo.bin

unsigned char foo_bin[] = {
0x61, 0x62, 0x63, 0x64
};

unsigned int foo_bin_len = 4;
9. file

UNIX and Linux have never enforced any policy of filename extensions. Naming conventions have evolved, but they are guidelines, not policies. If you want to name your digital picture image00.exe, go ahead. Your Linux photo application gladly accepts the file no matter what the name is, although it may be hard to remember.

The file command can help when you have to retrieve a file from a brain-dead Web browser, which mangles the name—say a file that should have been named foo.bar.hello.world.tar.gz comes out as foo.bar. The file command can help like this:

$ file foo.bar

foo.bar: gzip compressed data,
was "foo.bar.hello.world.tar", from Unix

Perhaps you received a distribution with a bin directory full of dozens of files, some of which are executables and some are scripts. Suppose you want to pick out all the shell scripts. Try this:

$ file /usr/sbin/*  | grep script

/usr/sbin/makewhatis: a /bin/bash script text
executable

/usr/sbin/xconv.pl: a /usr/bin/perl script
text executable

The file command identifies all the files in the bin directory, and the grep command filters out everything not a script. Here are some more examples:

file core.4867

core.4867: ELF 32-bit LSB core file Intel 80386,
version 1 (SYSV), SVR4-style, from 'abort'

file /boot/initrd-2.4.20-6.img

/boot/initrd-2.4.20-6.img: gzip compressed data,
from Unix, max compression

file -z /boot/initrd-2.4.20-6.img

/boot/initrd-2.4.20-6.img: Linux rev 1.0 ext2
filesystem data (gzip compressed data, from Unix,
max compression)

Just as you shouldn't judge a book by its cover, you shouldn't assume the contents of a file based on its name.

10. objdump

This is a more advanced tool and is not for the faint of heart. It's sort of a data-mining tool for object files. A treasure trove of information is encoded inside your object code, and this tool lets you see it. One useful thing this tool can do is dump assembly code mixed with source lines, something gcc -S doesn't do for some reason. Your object code must be compiled with debug (-g) for this to work:

objdump --demangle --source myobject.o

objdump also can help extract binary data from a core file for postmortem debug when you don't have access to a debugger. A complete example is too long for this article, but you need the virtual address from nm or obdump -t. Then, you can dump the file offsets for each virtual address with objdump -x. Finally, objdump is able to read from non-ELF file formats that gdb and other tools can't touch.

How many are speaking the Truth?

There are 100 persons present in a room. Each of them are making one statement one-after-the-other. Here are their statements in sequence..

1. There is no one in this room who speaks the truth.
2. At most 1 person in the room speaks the truth.
3. At most 2 person in the room speaks the truth.
4. At most 3 person in the room speaks the truth.
..
..
100. At most 99 person in the room speaks the truth.

Based on their statements, can you find out the number of truth tellers present in the room?

Arrange Numbers on Sides of a Traingle

Can you arrange 1-9 number on the sides of a triangle so that each side contains 4 numbers and sum of numbers on all the three sides are equal? If yes, how many such arrangements are possible?