Thursday, December 24, 2009

Warning: the `gets' function is dangerous and should not be used.

Strange title? Many novice programmers see this peculiar message when they compile their very first string-based C programs. Surprisingly, many simply ignore it, and students are rarely taught how serious that simple warning is.

The basic definition of 'gets' from Wikipedia:

"gets is a function in the C standard library, declared in the header file stdio.h, that reads a line from the standard input and stores it in a buffer provided by the caller.

Use of gets is strongly discouraged. It is left in the C89 and C99 standards for backward compatibility (but officially deprecated in late revisions of C99). Many development tools such as GNU ld emit warnings when code using gets is linked. The programmer must know a maximum limit for the number of characters gets will read so he can ensure the buffer is big enough."

The last line is something I would like to stress. A buffer in C is a block of memory allocated for some use, such as storing a string of characters. Its size is fixed by the programmer. Now imagine a programmer allocates a buffer of 10 characters, so our buffer would look like:

size = 10 <====10====>
buffer => [_|_|_|_|_|_|_|_|_|_]Adjacent Memory=>

Now our buffer needs to hold some data, such as a string of characters. This is where 'gets' comes in: it reads a line of input from the user and stores it in the buffer. This is also where the problem comes in. The job of 'gets' is to read the data and store it in the buffer, not to check how big the data is; it is never even told how large the buffer is. Therefore, if someone enters a string of 11 characters, we have the infamous situation called a "Buffer Overflow". (Strictly speaking, 'gets' also appends a terminating null character, so even 10 characters are one too many for this buffer.)

After entering "Hello World" (11 characters including the space), the adjacent memory is overwritten:

size = 10 <====10====>
buffer => [H|e|l|l|o|_|W|o|r|l]ddjacent Memory=>
Memory Overwritten----------^
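
To put numbers on the picture above, here is a small sketch in C that only does the arithmetic and never overflows anything itself. The helper names bytes_needed and bytes_overflowed are made up for this post; they are not part of the standard library. The count includes the terminating null character that every C string carries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to store `text` as a C string,
   i.e. its characters plus the terminating '\0'. */
size_t bytes_needed(const char *text) {
    assert(text != NULL);
    return strlen(text) + 1;
}

/* Hypothetical helper: how many bytes would be written past the end
   of a buffer of `capacity` bytes if `text` were copied in unchecked. */
size_t bytes_overflowed(const char *text, size_t capacity) {
    size_t needed = bytes_needed(text);
    return needed > capacity ? needed - capacity : 0;
}
```

For "Hello World" and a capacity of 10, bytes_needed reports 12 and bytes_overflowed reports 2: the final 'd' and the terminating null both land in adjacent memory.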

A buffer overflow writes the extra data into the adjacent areas of memory. This may merely cause a crash, but it also poses a security threat: malicious users can exploit the flaw to write data of their choice and make the application behave the way they want. That is why one should avoid 'gets' and prefer safer functions such as 'fgets', which takes the size of the buffer as an argument and never stores more than that many bytes, terminating null included.
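
As a rough illustration of the safer route, here is a minimal sketch built on 'fgets'. The wrapper name read_line is my own invention, not a standard function, and dropping the trailing newline is just one reasonable choice. Unlike 'gets', 'fgets' keeps the newline in the buffer when there is room for it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, which holds
   `size` bytes. Returns 1 on success, 0 on end of input or error. */
int read_line(FILE *in, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    /* fgets stores at most size - 1 characters, then a '\0'. */
    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    /* Cut the string at the newline, if one was read. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

With char buffer[10], a call such as read_line(stdin, buffer, sizeof buffer) on the input "Hello World" stores only "Hello Wor" plus the terminating null. The remaining "ld" stays in the input stream for the next read instead of spilling into adjacent memory.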

Have a safe Merry Christmas!!
