Archive

Author Archive

Lenovo ideapad Z570 review

July 15, 2011 Leave a comment

Hi everyone,

I am writing this review based on my own experience, since I use this laptop myself.

The Lenovo IdeaPad Z570 comes to market with the following configurations:

2nd-gen Intel Core i3/i5

3 GB/4 GB RAM

640/750 GB HDD

Intel integrated graphics / NVIDIA GeForce graphics card with 1 GB memory

Windows 7 Home Basic/Home Premium

15.6″ HD LED glare display

Card reader, numpad

3 USB ports.

Lenovo IdeaPad Z570 models have a thermal management button that enables efficient heat management of the laptop depending upon the surrounding temperature.

Another speciality is that these laptops have rounded keys, which are more comfortable to work with; there is also a gap of 0.5 mm between the keys, which makes typing easier. The sound is good and the video clarity is also good. The screen is about 0.5 cm thick.

The overall look of the laptop is good. There is also a separate numpad, which adds to the laptop’s ease of use.

There is no choice of colour: only one colour is available (metallic grey with a pinkish touch).

Overall performance is good. The battery gives a backup of about 5 hours when fully charged.

Rating: 9/10.

Categories: GENERAL

The Open Source Office Software Sector Heats Up

July 2, 2011 Leave a comment
LibreOffice

The world of LibreOffice and OpenOffice(.org) has been heating up recently with several exciting and, at times, bewildering developments. The Document Foundation remains very active as is LibreOffice development, but Oracle has given up on OpenOffice and slapped LibreOffice in the face by giving it to Apache. Perhaps the most important announcement was the release of LibreOffice 3.4.0.

The recent release of LibreOffice 3.4 demonstrates the philosophical differences between community projects and those stifled by commercial interests. LibreOffice development has been happening at an unprecedented pace, while OpenOffice lagged behind and lost many of its previous users. Even under Sun, development was tightly controlled, but Oracle tightened those bonds further. In contrast, according to the release announcement, LibreOffice now has 120 happy developers committing approximately 20 changes per day. Cédric Bosdonnat puts the number of contributors at 223. Italo Vignoli is quoted as saying, “We care for our developers, and it shows.”

Just before LibreOffice 3.4 was released Oracle announced that it was donating OpenOffice to the Apache Software Foundation. Pundits have speculated all around the spectrum of how that will affect the office suite with some thinking it will certainly benefit while others think it will most likely wane even further. The Document Foundation expressed disappointment that a reunification of the two projects will probably not occur but offered their best wishes for OpenOffice. They were upbeat about including OpenOffice code since the Apache license is compatible with the GNU Lesser General Public License under which LibreOffice is released. Given these facts, “the event is neutral for The Document Foundation.”

What’s New in LibreOffice 3.4?

Most folks just want to hear about the pretty, handy features visible in their daily work, but underestimating the impact of code clean-up is a disservice to developers. These code clean-ups are what lead to faster operation and fewer crashes. Michael Meeks calls this “ridding ourself of sillies.” One area in which these two worlds merge comes in an example given by Meeks: icons. He said, “OO.o had simply tons of duplication, of icons everywhere” – approximately 150 duplicated or missing icons. He added, “All of that nonsense is now gone.” A font memory leak has been fixed, and rarely used encodings have been moved out into their own library. This “reduces run-time memory consumption” and shrinks the download size.


New gradient page border

Writer has gotten some eye candy like gradient color backgrounds, drop shadows, and colored footnote separators to spruce up the appearance. A new font engine makes text prettier and faster. Flat ODF filters can make .odf files more accessible to other applications.

For the Ubuntu user, Unity global menu support has been added and improved GTK+ integration gives LibreOffice a native look. Better mouse theme support adds a little more polish. Encrypted document passwords can now be changed while the document is still open.


Impress got improved HTML export with image thumbnail gallery

The full list of changes and most annoying bugs is located on the wiki.

Version 3.4.1 is scheduled for release on Jun 29, 2011, with patch releases coming pretty much monthly afterwards. Version 3.5.0 is expected on Feb 8, 2012, and 3.6.0 on Aug 1, 2012. Major releases are scheduled every six months and largely synchronized with popular Linux distribution releases. Releases are supported for one year.

What Else?

Michael Meeks published a “Why LibreOffice is the Future” article back in May. In it he enumerated many reasons LibreOffice is a better choice than OpenOffice or others. He posits that LibreOffice is vendor neutral, whether you’re talking about Red Hat, Novell, or Canonical. He also thinks LibreOffice isn’t vulnerable to contributors leaving because it’s a community project with lots of other participants. Another point is that “Linux distributions are safer with LibreOffice” because of the new time-based release schedule and stability from a diverse contributor base. See his full article for more.

According to Steven J. Vaughan-Nichols Attachmate and Novell will continue to support LibreOffice. In addition, in response to the Oracle OpenOffice contribution, Holger Dyroff, Vice President of Business Development, SUSE said, “SUSE is continuing to invest in LibreOffice and The Document Foundation. SUSE is looking forward to the future contributions of IBM and potentially others into this new ASF incubator project, but would certainly have liked to see such contributions go directly to LibreOffice. We will follow the incubation process very closely to understand future opportunities and possibilities which can improve our offerings for our users and customers.”

If you’ve been following the development of LibreOffice through the project’s announce mailing list, that’s over now. The foundation has decided to announce only stable/final releases on that general list from now on. Developmental releases will be announced on a few lists used primarily by developers. They said this is to avoid users trying to run developmental releases in production environments. It’s doubtful that many actually risked critical work environments, though, and limiting the use of developmental releases could result in fewer bug reports. Time will tell.

Conclusions

Linux and Open Source software is rarely boring and the office suite sector has certainly offered its share of drama over the last year or so. And it hasn’t let up yet. Many pundits think Oracle’s contribution of OpenOffice to Apache will certainly benefit users. Apache is a well respected organization and with IBM and some distributions expected to contribute, many think OpenOffice will likely see continued development – giving users an option that was largely considered gone a few months ago. (This is assuming Apache officially accepts the project as predicted.)

On the other hand, LibreOffice has taken off like wildfire. Developers and contributors continue to flock to LibreOffice’s corner and distributions are switching left and right. Under the community-contribution model, new features and code improvements are being integrated at an amazingly rapid pace. Again, this is all the better for users.

So, just about any way you look at it, the ordinary everyday Linux user is the beneficiary of all this code shuffling. It’s exciting to watch as well. The next few months will be especially interesting as we begin to see how Apache Office progresses and whether the Apache license will end up attracting more developers than LibreOffice’s LGPL.

SOURCE:LINUX JOURNAL

Categories: GENERAL

FOSS is Fun: A Testing Time

July 2, 2011 Leave a comment

By : Kenneth Gonsalves

First, the sales team meets the client, and promises him anything whatsoever—as long as they get the order. Then the payment schedule is set up, and a hefty advance is taken. Next, the folks in design move in—they may or may not meet the client, but they produce a huge number of specs, graphs, diagrams, etc. Again, the client may or may not get to see them. Then, it’s over to ‘Production’. Here, the application is chopped up into bits, and handed over to various teams—in such a way that no team gets an overview of the whole application.

Soon, the coding starts. Each coder is given a little bit to do, while all coders are given commit rights. Since the left hand does not know what the right hand is doing, a lot of the code is duplicated—or the same functionality is implemented in several different ways, in the same application. At this stage, no attempt is made to check if the various parts work together. Peer review—that is, one coder criticising the work of another—is never done; it is seen as an insult. Then the pieces of code are all sent to an ‘integrator’, who is the most highly paid person in the project. He performs some magic to make everything work together.

The application then goes to the testing group. Testing is something everyone hates to do. If those in this department find bugs and flaws, it goes back to the integrator to be resolved. Finally, the application goes to the service and support team, who installs it for the customer. This is perhaps six months to a year after the order was placed. The customer’s business needs would have changed by then, but since the money has been paid, the client silently accepts what is delivered. The customer then does some testing (a task usually assigned to the vendor’s support team, who hasn’t a clue as to what is happening, as the production team that originally worked on the application has moved on elsewhere). So the customer either accepts and makes the best of what’s been delivered, or ends up spending more money to fix/modify things that should not have been broken in the first place. Although this sounds crazy, apparently this is how the software industry works.

Open source development follows a totally different path. For one, the customer is pulled in as a contributor to development. The application runs from, practically, day one; the customer sees the application in all stages of development, and is encouraged to play with it by entering real data. The developers see the whole application at all times, and hence code is not duplicated, nor is the same thing done differently in different parts. Peer review of code is encouraged, everyone improves, no one takes offence. So when the customers actually get the application, they are familiar with it, and find that they’ve got what they wanted. And it works.

So what about testing? Testing is the fun part of development—and the developers write the tests. The mantra is: “First write your tests, then write the code. As you keep writing the code, run your tests. When all tests pass, stop writing code.” Of course, it does not work in such a linear fashion. One usually writes some proof-of-concept code at first, then tests it; this is followed by more code, then more tests. When the developer is also the tester, bugs and problems tend to vanish. And when the customer is playing with the application using real data, functionality is also ensured.
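The mantra above can be sketched in a few lines of Python. This is a hypothetical illustration of the test-first loop, not code from the article; `invoice_total` is an invented example function:

```python
# Tests first: they specify the desired behaviour before any code exists.
def test_invoice_total():
    assert invoice_total([]) == 0
    assert invoice_total([10, 20, 12.5]) == 42.5

# Then just enough code to make the tests pass -- and no more.
def invoice_total(line_items):
    return sum(line_items)

test_invoice_total()
print("all tests pass")
```

Because the tests are run continuously while the code is written, the developer knows the moment a change breaks existing behaviour.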

Of course, to really understand and use this model, one has to view an application as an incrementally growing and evolving process—not something to be ‘produced’ and delivered when complete. Software is never final or complete.

SOURCE:LINUX FOR YOU

Categories: GENERAL

Ubuntu 11.04, Unity Released to Mixed Reactions

May 27, 2011 Leave a comment
Ubuntu

Ubuntu 11.04 was released on April 28 with a brand new interface and a couple default application changes. But all the talk is about Unity, that brand new interface. As one might predict, reactions are all over the spectrum.

The Unity interface has taken design cues from popular mobile systems with the focus being on saving screen space and making everything readily accessible from within that limited space. It appears designers were shooting for easy and beautiful, but some users are finding adjustment during these early days a bit challenging.

Unity consists of several significant changes to the traditional desktop layout, built around three main parts: the Dash, the Launcher, and the Top Panel. The Dash replaces the traditional menu system with a window of icons that launch applications or open places. The Launcher is the dock-like element on the left side of the screen where running apps are represented. The Top Panel is home to some applet indicators, but its main function is to serve as the focused application’s menu or main toolbar.

These Mac-like elements are causing some controversy. Some really like the new desktop, others find it very awkward, and yet others are neither impressed nor put off. There have been dozens of postings about Ubuntu’s new Unity, and they’ve been all over the map.

For example:

Ivor O’Connor said, “Ubuntu seems to be run by kiddies more interested in blinding you with eye-candy than allowing you to be productive.”

Ethan C. Nobles said, “Unity is, in essence, a strip of icons that sits mockingly on the left side of the screen and makes running and switching between applications very clumsy. It’s buggy, too.”

“I find Unity to be suffocating and unnecessary. For me it adds little value and seems to be in the way most of the time; so I would definitely not use Ubuntu 11.04 as one of my regular distros. I tried to like it but I just couldn’t,” said Jim Lynch.

Of course the reviews aren’t all bad:

“I have to say that a few months of using Unity leaves me loving it. There’s no desktop out there – not Windows, KDE or even OS X – that feels this well integrated and consistent.” That is from Justin Pot.

A blogger identified as Zenobia said, “Unity was like a breeze of fresh air. I was quite excited with the changes. I love the dash in Unity.”

“I like the changes a lot, because the desktop environment gets out of the way when I am using an application, but the launcher and application chooser is there if and when I want them,” said Zeth.

Then you have those in the middle of the road:

“After a bit of work, I’m enjoying my new Ubuntu with Unity. I don’t think it’s better than the previous Ubuntu, but it looks nice; it’s visually appealing and fast. But in my opinion, not as easy to use for those familiar with Ubuntu/Linux.” This was posted on utherpendragonfly.wordpress.com.

“This is not a disaster like the KDE 4 release was. Ubuntu 11.04 is really the culmination of what Canonical have been doing for the past 6 (or so) years: it’s generally slick, it makes bold and well thought out choices, and it doesn’t get in your way,” was found on flavor8.com.

Rob Williams said, “Unity impressed me a lot more than I expected it to. After some use, that all becomes easier to get used to, but I don’t think it’ll ever feel like it’s the “best” way to do things. The simple fact is that it’ll require more steps than what we’re used to.”

One thing to note about most of the reviews is that few were entirely negative or positive; most mention both good things and bad. Again, like the thesis of this article, feelings were mixed. Another noticeable trend is that there were more negative than positive posts, but that’s probably to be expected given human nature.

More evidence of this can be found in a recent poll at tuxmachines.org. Never has a poll been so closely voted:

How’s Ubuntu 11.04? (679 total votes)

Great!: 14%
Good: 13%
Okay: 13%
Not So Good: 15%
Awful!: 11%
Who Cares?: 35%

Work-flow isn’t the only consideration. There have been significant bugs reported as well. The most prominent was the installer partition-selection bug, which prevented those with partitioned drives from choosing which partition to install upon.

This release may have been a real departure for Ubuntu and its developers, but users are not universally pleased: some are and some aren’t. So, if you were waiting for the reviews to help you decide, you’re out of luck. This is one you’ll have to test and decide on for yourself.

SOURCE:LINUX JOURNAL

Categories: GENERAL

Ebook Publishing Using Linux Tools

May 27, 2011 Leave a comment


Ebook

Digital books, aka “ebooks,” are going to change the publishing world just as iTunes and digital music have changed the music industry. At the moment, Amazon’s Kindle seems to be the biggest fish in the pond. While the actual numbers are fuzzy, Amazon’s Kindle appears to be driving ebook growth, as suggested by this article.

Recent news points to authors making a dramatic shift from traditional publishing houses to self-publishing, as pointed out in this article that describes why Barry Eisler turned down a $500,000 deal from a mainstream publisher, choosing instead to self-publish.  This particular article was in fact my own tipping point: I had written a science fiction novel 30 years ago that I was unable to get any of the publishing houses interested in at the time.  I thought to myself, “Why not?”  So I dusted off the old digital manuscript, completely rewrote the story, and recently published it on Amazon’s Kindle publishing site.  BTW, for more info on how the original digital manuscript migrated from machine to machine over that thirty year period, see the Author’s Note on the home page for my novel, Second Cousins.

As a long-time hard-core Linux user, I thought that some of you other Linux folks might be interested in how to write and publish a Kindle ebook using only Linux tools.  Before I give the list of required software for publishing a Kindle ebook using Linux (it’s a short list), I want to point out that there isn’t any good single “Howto” guide that I’ve been able to find that describes the best way to publish a Kindle ebook.  There are a whole bunch of references that describe part of the process, like this one, for example, that describes how to create an NCX file that will enable live table-of-contents navigation on the Kindle.  I spent a lot of time diddling with XML and OPF (Open Packaging Format) files before deciding that this was not the way to go.

Likewise, some of the Kindle HowTo references out there suggest writing your book using an html editor, defining bookmarks and tags to specify the table of contents, cover, and start page  in such a way that the Kindle device will recognize them. Again, wrong approach, IMO.  When I write, I want to focus on the story, not the software.

Then I found The Answer: this.  An OpenOffice template specifically designed to support publishing Kindle ebooks.  And that, ladies and gentlemen, is the only software you need to publish a Kindle ebook on Amazon.  If you follow the simple YouTube video instructions for using this template, you can directly upload the .doc file generated by the template to the Amazon Kindle publishing site.  No muss, no fuss.  This file contains all the tags and bookmarks necessary for a Kindle device or one of the free Kindle reading apps to render the cover, table of contents, and book contents correctly.  Further, the OpenOffice Kindle template formats your text so that you see your book as it will appear when viewed on the Kindle.

Considering the amount of time I save by using this template, the nominal fee charged by its developer is well worth it for me. However, if you really do want to write your book in html, and create the ancillary NCX and OPF files, you can do this and then create an uploadable Kindle ebook file using the free Kindlegen app from Amazon.  But seriously, why would you want to?

SOURCE:LINUX JOURNAL

Categories: GENERAL

FOSS

May 27, 2011 Leave a comment

Free and open-source software (F/OSS, FOSS) or free/libre/open-source software (FLOSS) is software that is liberally licensed to grant users the right to use, study, change, and improve its design through the availability of its source code. This approach has gained both momentum and acceptance as the potential benefits have been increasingly recognized by both individuals and corporations.

In the context of free and open-source software, free refers to the freedom to copy and re-use the software, rather than to the price of the software. The Free Software Foundation, an organization that advocates the free software model, suggests that, to understand the concept, one should “think of free as in free speech, not as in free beer”.

FOSS is an inclusive term that covers both free software and open source software, which, despite describing similar development models, have differing cultures and philosophies. Free software focuses on the philosophical freedoms it gives to users, whereas open source software focuses on the perceived strengths of its peer-to-peer development model. FOSS is a term that can be used without particular bias towards either political approach.

Free software licenses and open source licenses are used by many software packages. While the licenses themselves are in most cases the same, the two terms grew out of different philosophies and are often used to signify different distribution methodologies.

SOURCE:WIKIPEDIA

Categories: GENERAL

Talking Point: Overlapping Windows

May 23, 2011 Leave a comment

Back in the 80s, a GUI paradigm called WIMP (Windows, Icons, Mouse, Pointer) began to establish itself as the new way in which most people interacted with computers. When it comes to one of the most significant elements of that system, overlapping windows, I’m beginning to wonder, has it had its day?

One of the few things that Microsoft can claim to have developed from scratch is an efficient method of application switching called the taskbar, although it’s now in the process of being superseded on most GUIs by the application dock. One side effect of that form of program management is that it doesn’t penalize the user for running applications fullscreen, and it therefore encourages it. You can glean some ideas about modern user behavior by observing that, in the most popular WM themes and skins, the areas of the window that are used for resizing have almost disappeared. The truth is, if you use GNOME or KDE, you probably run most of your apps fullscreen, most of the time.

In the future, I think that overlapping windows will be seen as a power user’s feature, rather like the command line. The non-expert computer user has little use for windows that don’t encompass the entire screen, and novice users find resizable, overlapping windows confusing. There are some operations, such as dragging and dropping of file icons, that benefit from overlapping windows, but again, this is a feature that is mostly used by experts.

PDAs and other small computers have long pioneered the techniques needed to make multiple running programs individually accessible. Running everything fullscreen on a full-sized device does, however, present a few drawbacks. For one thing, text can be difficult to read when spread out over large areas on modern widescreen monitors. Personally, I wouldn’t fancy word processing on a 24” widescreen monitor with the main window maximized. I think that multi-column websites give us some clues as to what a desktop of the future might look like.

There are probably two solutions that we are going to see dominate over the next few years.

Firstly, tiled window management, of a sort that has existed for many years on Linux, may finally break through to the mainstream. Tiling has the advantage that it does away with the complexities and inefficiencies of overlapping windows while still allowing the user to view more than one window at once. It’s worth noting that KDE SC 4.5 introduced tiling support.

The wmii window manager. Could this be a glimpse into the future?

Secondly, it’s possible that applications will begin to make use of more panes within a main window. For example, on a widescreen monitor, it’s quite convenient to leave the Firefox sidebar open at all times. I wonder if other subwindows could be enabled by default, perhaps piping in pertinent information? Some tiled window managers can simulate this approach, to an extent, by allowing you to associate certain applications together into groups.

Back in the mid-90s, Apple and IBM collaborated on an application framework called OpenDoc. The idea behind OpenDoc was that application components could be freely embedded into host applications. So, for example, if you clicked on a image in your word processor, a toolbar might appear around it, courtesy of Adobe Photoshop. Although the tech did appear in the form of some proof of concept applications that shipped with OS/2, it was ultimately abandoned. However, another framework such as that one could solve some of the problems of efficient use of screen resources in an intuitive manner without resorting to traditional overlapping windows.

LyX 2.0 is an application that can pack quite a lot into its main window. Perhaps this will be the norm in the future?

Of the tiled WMs I’ve seen, none of them seem to be very easy for the newcomer to use. So the question is, is anyone actually using these things on a day to day basis?

SOURCE:LINUX JOURNAL

Categories: GENERAL

Spotlight on Linux: Toorox

May 23, 2011 Leave a comment
Toorox

Toorox is a Gentoo-based installable live CD that features your choice of KDE or GNOME desktops. It comes with lots of useful applications including system configuration tools, easy package management, and proprietary code installers.

Toorox is sometimes compared to another Gentoo-based distribution, Sabayon. This comparison may be legitimate on the surface, but differences emerge when looking deeper. Sabayon is indeed based on Gentoo, as is Toorox, but Sabayon is primarily a binary distribution: package installation almost always involves installing binary Sabayon packages. While this is convenient and often preferred, Toorox compiles and installs software from Gentoo sources. Toorox begins life on your computer as a binary installation with all its advantages, such as being fast, easy, and ready at boot, but subsequent package installation compiles source packages. So Toorox is perfect for users who would like a source-based distribution but don’t want the initial investment of time and effort. Either over time or with an all-at-once effort, one can fairly easily transform Toorox into a full source install.

Toorox lists some of their software in an introduction that appears when the desktop starts. These include:

– Kernel 2.6.37-gentoo
– KDE 4.6.0
– Xorg-Server 1.9.4
– LibreOffice 3.3.1
– IceCat 3.6.13
– Thunderbird 3.1.7
– K3b 2.0.2
– Gimp 2.6.11
– Wine 1.3.14
– VLC 1.1.7
– Amarok 2.4.0
– Audacious 2.4.3
– Ardour 2.8.7
– Kino 1.3.3
– Cinelerra 20101104

Toorox includes two graphical Portage front-ends: Potato and Porthole. Of course, users can use Portage at the commandline just as in Gentoo. In any case, there’s plenty of software available to install.

In addition, users may wish to install NVIDIA or ATI proprietary drivers. In the Systemconfig are the utilities that will install those. Users may also install Flash and multimedia libraries with the provided scripts.


Toorox GNOME desktop tools

Like other Gentoo-based systems, Toorox suffered through growing pains and initial failings. But also like Sabayon, it’s shown great improvement over the years and now gives users a stable and enjoyable experience.

The hard drive install is a simple procedure, asking only a few questions. It does offer one bootloader option rarely seen. It offers the usual choices of installing on the MBR or root partition, but it also allows users to add Toorox to an existing bootloader list. To use that option, one merely ticks the partition that contains the bootloader menu.

Toorox routinely comes in KDE and GNOME versions for 32-bit or 64-bit systems. The basic look and feel have been updated a bit in the newest releases, but overall it still retains the Toorox personality. This is usually formed from a black-to-white gradient background embossed with the Toorox logo, with dark panels and desktop widgets. The latest wallpaper features a multicolor design surrounding the Toorox logo and the machine architecture. Stable version 2.2011 was released February 27, and developmental release 3.2011 was released March 30.


Toorox KDE desktop

Toorox is a great choice for those who want a bit more control over their machine or would like an introduction to Gentoo with a little less pain. Some may say Toorox isn’t ideal for new users, but that really depends on the user. In between the vast work of Gentoo and the ease of Sabayon comes Toorox. Give it a try.

SOURCE:LINUX JOURNAL

Categories: GENERAL

Open Source Cloud Computing with Hadoop

May 23, 2011 Leave a comment

Have you ever wondered how Google, Facebook and other Internet giants process their massive workloads? Billions of requests are served every day by the biggest players on the Internet, resulting in background processing involving datasets in the petabyte scale. Of course they rely on Linux and cloud computing for obtaining the necessary scalability and performance. The flexibility of Linux combined with the seamless scalability of cloud environments provide the perfect framework for processing huge datasets, while eliminating the need for expensive infrastructure and custom proprietary software. Nowadays, Hadoop is one of the best choices in open source cloud computing, offering a platform for large scale data crunching.

Introduction

In this article we introduce and analyze the Hadoop project, which has been embraced by many commercial and scientific initiatives that need to process huge datasets. It provides a full platform for large-scale dataset processing in cloud environments and is easily scalable, since it can be deployed on heterogeneous cluster infrastructure and regular hardware. As of April 2011, Amazon, AOL, Adobe, Ebay, Google, IBM, Twitter, Yahoo and several universities are listed as users in the project’s wiki. Maintained by the Apache Foundation, Hadoop comprises a full suite for seamless distributed scalable computing on huge datasets. It provides base components on top of which new distributed computing subprojects can be implemented. Among its main components is an open source implementation of the MapReduce framework (for distributed data processing), together with a data storage solution composed of a distributed filesystem and a data warehouse.

The MapReduce Framework

The MapReduce framework was created and patented by Google in order to process their own page rank algorithm and other applications that support their search engine. The idea behind it was actually introduced many years ago by the first functional programming languages such as LISP, and basically consists of partitioning a large problem into several “smaller” problems that can be solved separately. The partitioning and finally the main problem’s result are computed by two functions: Map and Reduce. In terms of data processing, the Map function takes a large dataset and partitions it into several smaller intermediate datasets that can be processed in parallel by different nodes in a cluster. The reduce function then takes the separate results of each computation and aggregates them to form the final output. The power of MapReduce can be leveraged by different applications to perform operations such as sorting and statistical analysis on large datasets, which may be mapped into smaller partitions and processed in parallel.
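The partition-then-aggregate idea can be shown with a tiny word-count in plain Python. This is a conceptual sketch of the Map/Reduce pattern only (the dataset and function names are invented for illustration); a real Hadoop job distributes the partitions across worker nodes:

```python
from collections import Counter
from functools import reduce

# Map: each partition of the dataset is processed independently,
# producing an intermediate result (here, per-partition word counts).
def map_partition(lines):
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

# Reduce: the separate intermediate results are aggregated
# into the final output.
def reduce_counts(a, b):
    a.update(b)
    return a

dataset = ["the quick brown fox", "the lazy dog", "the fox"]
# Partition the problem: one partition per line here; in a cluster,
# each partition would be handed to a different worker node.
partials = [map_partition([chunk]) for chunk in dataset]
totals = reduce(reduce_counts, partials, Counter())
print(totals["the"])  # → 3
```

Because each `map_partition` call touches only its own chunk of data, the map phase parallelizes trivially; only the final reduce needs to see all intermediate results.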

Hadoop MapReduce

Hadoop includes a Java implementation of the MapReduce framework, its underlying components and the necessary large-scale data storage solutions. Although application programming is mostly done in Java, it provides APIs in different languages such as Ruby and Python, allowing developers to integrate Hadoop into diverse existing applications. It was first inspired by Google’s implementation of MapReduce and the GFS distributed filesystem, absorbing new features as the community proposed new specific subprojects and improvements. Currently, Yahoo is one of the main contributors to this project, making public the modifications carried out by their internal developers. The basis of Hadoop and its several subprojects is the Core, which provides components and interfaces for distributed I/O and filesystems. The Avro data serialization system is also an important building block, providing cross-language RPC and persistent data storage.

On top of the Core sits the actual implementation of MapReduce and its APIs, including Hadoop Streaming, which allows flexible development of Map and Reduce functions in any desired language. A MapReduce cluster is composed of a master node and a cloud of several worker nodes. The nodes in this cluster may be any Java-enabled platform, but large Hadoop installations are mostly run on Linux due to its flexibility, reliability and lower TCO. The master node manages the worker nodes, receiving jobs and distributing the workload across the nodes. In Hadoop terminology, the master node runs the JobTracker, responsible for handling incoming jobs and allocating nodes for performing separate tasks. Worker nodes run TaskTrackers, which offer virtual task slots that are allocated to specific map or reduce tasks depending on their access to the necessary input data and their overall availability. Hadoop offers a web management interface, which allows administrators to obtain information on the status of jobs and individual nodes in the cloud. It also allows fast and easy scaling through the addition of cheap worker nodes, without disrupting regular operations.

HDFS: A distributed filesystem

The main use of the MapReduce framework is in processing large volumes of data, and before any processing takes place it is necessary to first store this data in some volume accessible by the MapReduce cluster. However, it is impractical to store such large data sets on local filesystems, and much more impractical to synchronize the data across the worker nodes in the cluster. In order to address this issue, Hadoop also provides the Hadoop Distributed Filesystem (HDFS), which easily scales across the several nodes in a MapReduce cluster, leveraging the storage capacity of each node to provide storage volumes in the petabyte scale. It eliminates the need for expensive dedicated storage area network solutions while offering similar scalability and performance. HDFS runs on top of the Core and is perfectly integrated into the MapReduce APIs provided by Hadoop. It is also accessible via command line utilities and the Thrift API, which provides interfaces for various programming languages, such as Perl, C++, Python and Ruby. Furthermore, a FUSE (Filesystem in Userspace) driver can be used to mount HDFS as a standard filesystem.

In a typical HDFS+MapReduce cluster, the master node runs a NameNode, while the rest of the (worker) nodes run DataNodes. The NameNode manages HDFS volumes, and is queried by clients to carry out standard filesystem operations such as adding, copying, moving or deleting files. The DataNodes do the actual data storage, receiving commands from the NameNode and performing operations on locally stored data. In order to increase performance and optimize network communications, HDFS implements rack awareness capabilities. This feature enables the distributed filesystem and the MapReduce environment to determine which worker nodes are connected to the same switch (i.e., in the same rack), distributing data and allocating tasks in such a way that communication takes place between nodes in the same rack without overloading the network core. HDFS and MapReduce automatically manage which pieces of a given file are stored on each node, allocating nodes to process that data accordingly. When the JobTracker receives a new job, it first queries the DataNodes of worker nodes in the same rack, allocating a task slot if the node has the necessary data stored locally. If no available slots are found in the rack, the JobTracker then allocates the first free slot it finds.
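
The allocation policy just described can be sketched as a small simulation. This is my own illustration of the preference order, not Hadoop's actual scheduler code, and the class and method names are invented for the example: given the rack where a task's input data lives, the scheduler first looks for a free slot on a node in that rack that holds the data locally, and otherwise falls back to the first free slot anywhere in the cluster.

```java
import java.util.List;

public class RackAwareScheduler {
    // A worker node: the rack it sits in, whether it stores the task's
    // input data locally, and how many free task slots it currently has.
    static class Worker {
        final String name;
        final String rack;
        final boolean hasData;
        int freeSlots;

        Worker(String name, String rack, boolean hasData, int freeSlots) {
            this.name = name;
            this.rack = rack;
            this.hasData = hasData;
            this.freeSlots = freeSlots;
        }
    }

    // Prefer a free slot on a node in dataRack that has the data locally;
    // otherwise take the first free slot found, as the article describes.
    static String allocate(List<Worker> workers, String dataRack) {
        for (Worker w : workers) {
            if (w.rack.equals(dataRack) && w.hasData && w.freeSlots > 0) {
                w.freeSlots--;
                return w.name;
            }
        }
        for (Worker w : workers) {
            if (w.freeSlots > 0) {
                w.freeSlots--;
                return w.name;
            }
        }
        return null; // no capacity anywhere in the cluster
    }

    public static void main(String[] args) {
        List<Worker> cluster = List.of(
            new Worker("node-a", "rack1", false, 2),
            new Worker("node-b", "rack2", true, 1),
            new Worker("node-c", "rack2", false, 2));
        // Data lives on node-b in rack2, so node-b wins first.
        System.out.println(allocate(cluster, "rack2")); // prints node-b
        // node-b's only slot is now taken, so we fall back to the
        // first free slot in the cluster, which is on node-a.
        System.out.println(allocate(cluster, "rack2")); // prints node-a
    }
}
```

The payoff of this preference order is that, whenever possible, a task reads its input from a local disk instead of pulling it across the network core.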

Hive: A petabyte scale database

On top of the HDFS distributed filesystem, Hadoop implements Hive, a distributed data warehouse solution. Hive started as an internal project at Facebook and has since evolved into a full-blown project of its own, maintained by the Apache foundation. It provides ETL (Extract, Transform and Load) features and QL, a query language similar to standard SQL. Hive queries are translated into MapReduce jobs run on table data stored on HDFS volumes. This allows Hive to process queries involving huge datasets with performance comparable to hand-written MapReduce jobs, while providing the abstraction level of a database. Its performance is most apparent when running queries over large datasets that do not change frequently. For example, Facebook relies on Hive to store user data, run statistical analysis, process logs and generate reports.
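
To make the translation concrete, consider a hypothetical QL query over a log table, such as SELECT page, COUNT(*) FROM hits GROUP BY page. Conceptually, such a query compiles to a MapReduce job whose map phase emits the grouping column as the key and whose reduce phase counts the rows in each group. The plain-Java sketch below is my own illustration of that plan, not code generated by Hive, and the table and column names are invented:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupBySketch {
    // Conceptual plan for: SELECT page, COUNT(*) FROM hits GROUP BY page
    // Map phase:    for each row, emit the grouping column ("page") as the key.
    // Reduce phase: count the rows that arrived for each key.
    static Map<String, Integer> groupByCount(List<String> pageColumn) {
        Map<String, Integer> counts = new HashMap<>();
        for (String page : pageColumn) {           // map: key = page
            counts.merge(page, 1, Integer::sum);   // reduce: count per key
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("/home", "/about", "/home", "/home");
        System.out.println(groupByCount(hits).get("/home")); // prints 3
    }
}
```

Because each group is counted independently, the reduce side of such a job parallelizes across keys in exactly the way the word-count pattern does, which is why SQL-style aggregations fit MapReduce so naturally.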

Conclusion

We have briefly overviewed the main features and components of Hadoop. Leveraging the power of cloud computing, many large companies rely on this project to perform their day-to-day data processing. This is yet another example of open source software being used to build scalable, large-scale applications while keeping costs low. However, we have only scratched the surface of the fascinating infrastructure behind Hadoop and its many possible uses. In future articles we will see how to set up a basic Hadoop cluster and how to use it for interesting applications such as log parsing and statistical analysis.

Further Reading

If you are interested in learning more about Hadoop’s architecture, administration and application development these are the best places to start:

– Hadoop: The Definitive Guide, Tom White, O’Reilly/Yahoo Press, 2nd edition, 2010
– Apache Hadoop Project homepage: http://hadoop.apache.org/

Source: Linux Journal

Categories: GENERAL

Write Your Next Program on Linux

May 23, 2011 Leave a comment

Through this article, I want to ask people to start programming on the GNU/Linux operating system (from here on, referred to as just ‘Linux’). Students who are just getting started in programming; educators who teach or have a role in teaching programming to new students; hobbyists who program on Windows—I’m asking all of you to please read on and give Linux a real good try for at least a week. If you agree that programming on Linux is indeed a better experience than your previous platform, then stay with it, and enjoy the freedom that the rest of us do!

Just to clear up any misunderstanding, I am not aiming to get you to write code for the Linux kernel itself (though that could well follow as your comfort and programming proficiency grow). Instead, I’m talking about writing user-space programs—including the exercises, homework, and project work that most computer-science courses include. Before we start, here’s a disclaimer: this article contains strong personal opinions and beliefs; I do not in any way intend to be offensive, but some of these ideas just might be worth a try—by you—to see if you feel the same way!

Attacking the mindset

It’s commonly believed that Linux is ‘tough’. Sure, it’s different from what Windows users are accustomed to—but it’s not tough. Once you adjust to the differences, you’ll probably laugh at this misconception yourself, and tell others how wrong their perception is!
Just consider the many computer science students who’ve been inspired by the buzz that Linux has been creating over a long time now. They have resolutely set about learning how to use it on their own initiative—asking questions on mailing lists, forums and over IRC chat. Within a couple of weeks, they are ready to do more than just get around. Often, within a month, they’re so much at home with Linux that they begin introducing others to the OS. Astounding? It may seem so—but it’s just that those students were determined to explore and learn, and ignored the cries of, “It’s tough.”
There is always a learning curve involved whenever one is acquiring a new skill, and Linux is no exception. If students are taught to use and program on Linux, they will not just learn, but will also find it simple. It would just seem natural to them—learning something that they did not know earlier. ‘Linux is tough’ is a modern-day myth that has to be busted. If you are an educator, please do your bit. You are the one that students look up to, and if you show them the right way, they will follow your example.

Getting Linux up and running

Okay, once you have decided to use Linux, how do you go about it? You may have heard of lots of different Linux ‘operating systems’ (also called distributions): Ubuntu Linux, Fedora Linux, Debian GNU/Linux and more. Why so many ‘Linuxes’? Let me explain. Technically, ‘Linux’ is the name of a kernel (read http://en.wikipedia.org/wiki/Linux_kernel for more information; see http://www.kernel.org, which is the official home of the kernel). Since a kernel is of little use on its own, user-space tools from the GNU project (including the most common implementation of the C library, a popular shell, and many common UNIX tools that carry out many basic operating system tasks) were combined with the Linux kernel to make a usable operating system. The graphical user interface (or GUI) used by most Linux systems is built on top of an implementation of the X Window System. Different free software projects and vendors build different combinations of packages and features, to provide varying Linux experiences to different target audiences—thus resulting in myriad Linux distributions.
So which Linux distribution should you use? Ubuntu Linux (http://www.ubuntu.com) and Fedora Linux (http://www.fedoraproject.org) both have individually made the Linux experience very user-friendly for casual users of the computer—for Internet surfing, e-mail and document processing needs. Either of these is ideal for you to get started with. Linux installation can be somewhat tricky, though, especially if you intend to set up a dual-boot system where you can boot either Linux or your old Windows. Otherwise, it’s quite simple: download the CD (ISO) image, burn it to a CD-R or RW, boot your computer from it, and let it install! The best way to do a dual-boot set-up the first time is to get hold of someone in your school, locality or office who knows about it, and ask them to guide you. Also, there are other options if you want to try Linux either without installing it, without replacing Windows or doing a dual-boot set-up. See the Dealing with practicalities section towards the end of this article, for some of these ideas.
The Ultimate Linux Newbie Guide (http://www.linuxnewbieguide.org/) is a good reference to help you learn things yourself. With Linux, an experimental approach to learning helps a lot. So, back up your data, and get started with those install discs if you can’t find anyone to help you out. These days, most Linux distributions come with just the essential applications and libraries installed—which probably won’t be sufficient for programming needs. To enable easy installation of new software, most distributions have a package manager (in the Linux world, software is distributed in the form of ‘packages’), which you use to easily download and install new software from the Internet. The Linux Newbie Guide mentioned earlier is a good reference for this topic. So that this article will be of maximum utility, I will try to be more general, and avoid favouring any particular distribution.

Choosing a text editor

We won’t be using an Integrated Development Environment (IDE), at least initially. We will just do it the simple way: write code using a text editor, save it, and compile/interpret it using an appropriate compiler/interpreter. In the Linux world, you have a plethora of text editors to choose from. An editor such as gedit or kwrite will definitely be installed when you install Linux—you can use either. If you install a distribution like Ubuntu, which has the GNOME desktop environment, then you will have gedit already installed. It’s just like Notepad, only more useful and feature-rich.

C/C++ programming on Linux

C is usually the first language taught to many students in Indian engineering schools and colleges, so let’s first look at how we program in C on Linux. Note that the C code that you will write on Linux will be the same that you would write on Windows/DOS, as long as you are writing ANSI C code. Some library functions, such as those provided by conio.h and graphics.h, are not part of the ANSI standard. Hence, you won’t be able to use them on Linux. The C compiler you use on Linux is GCC. It is part of the GNU Compiler Collection (http://en.wikipedia.org/wiki/GNU_Compiler_Collection).

Open a terminal and run the command gcc:

$ gcc
gcc: no input files

If you see something like the above output, gcc is already installed. If you see something like “Command not found”, then you will have to install gcc using the package manager. Besides a compiler, you will also need the C standard library, called glibc, to compile your C programs correctly. Type in locate glibc and check the output. If it shows directory structures of the form ‘/foo/bar/glibc’ or the like, then you have glibc installed; else you need to install it.
Okay, now that we have confirmed the presence of a text editor, a compiler and the standard library, let us write our first code in C on Linux. For the purpose of this article, let’s create a sub-directory called ‘codes’ under the home directory, in which we will store all our source code.
Start up gedit and input this simple C code to print the factorial of a number:

#include <stdio.h>

int main(int argc, char **argv)
{
    int n, i, fact = 1;
    printf("Enter a number for which you want to find the factorial:: ");
    scanf("%d", &n);
    for (i = 1; i <= n; i++)
        fact = fact * i;
    printf("Factorial of %d is :: %d\n", n, fact);
    return 0;
}

Save this code in the codes sub-directory with the name ‘factorial.c’ and use cd codes to go to this directory in your terminal. Once you are there, issue the following command:

$ gcc factorial.c

After executing the command, run ls and you will see an ‘a.out’ file in the current directory. This is the executable file of your C program, compiled and linked with the appropriate libraries. To execute it, run (note the leading ./, which is essential!):

$ ./a.out

Enter a number for which you want to find the factorial:: 5
Factorial of 5 is :: 120

Congratulations, you have just written your first C program on Linux! That was just the normal C that you write on DOS or Windows—no surprises there! A bit more about this a.out file: it is the Linux equivalent of the .exe file that you would see under DOS/Windows; it is the executable form of your code. As you might have already guessed, this file cannot be executed on DOS or Windows, since it is in a different format. Now, instead of settling for the default a.out name each time you compile, you can specify the output file name to the compiler:

$ gcc -o factorial factorial.c

Try a few more programs from your C programming and data structures classes. ‘The C Programming Language’ by Brian Kernighan and Dennis Ritchie is a well-known book that teaches C programming with a strong UNIX flavour. It would be a good idea to try the examples and exercise programs from this book to get a feel for C programming on Linux. Let’s now write our first C++ program on Linux. The cycle of coding, compilation and execution is very similar to that for C, except for the compiler we use, which is g++. Check if it’s already installed by running the command in a terminal, like we did for gcc. Next, use your package manager to check if libstdc++, the standard C++ library, is installed (if not, install it). Once both are installed, open up gedit and type this simple C++ program:

#include <iostream>
#include <string>

using namespace std;

int main(int argc, char **argv)
{
    string s1 = "Hello";
    string s2 = "World";
    cout << s1 + " " + s2 << "\n";
    return 0;
}

Save this file as string-demo.cxx in the codes subdirectory.
Compile and execute the file:

$ g++ -o string-demo string-demo.cxx
$ ./string-demo
Hello World

The C++ code you see is standard C++, with the ‘.h’ omitted from the header files. C++ source files conventionally use one of the suffixes ‘.C’, ‘.cc’, ‘.cpp’, ‘.c++’, ‘.cp’, or ‘.cxx’.

Let us now write a simple C++ program that uses classes:

#include <iostream>

using namespace std;

class Circle {
    float r;
public:
    void init(float x) /* Inline function */
    {
        r = x;
    }
    float area();
};

float Circle::area()
{
    return 3.14 * r * r;
}

int main(int argc, char **argv)
{
    float radius;
    Circle circle;
    cout << "Enter the radius of the circle:: ";
    cin >> radius;
    circle.init(radius);
    cout << "Area of the Circle:: " << circle.area() << "\n";
    return 0;
}

Save the file in the codes sub-directory as class-demo.cxx.
Compile and execute it:

$ g++ -o class-demo class-demo.cxx
$ ./class-demo

Enter the radius of the circle:: 4
Area of the Circle:: 50.24
Assuming that you have been able to compile these programs successfully, I would now recommend that you go ahead and write, compile and test some of your C/C++ assignments and problems using gcc and g++. If you face any issues, you are most welcome to e-mail me personally.

Java programming on Linux

Java is perhaps the next most widely taught language in Indian schools and colleges after C/C++. The best part of Java programming on Linux is that you use the same tools that you would use on Windows—yes, the Sun Java Development Kit.
To install the JDK on Linux, download the installer for Linux from http://java.sun.com/javase/downloads/widget/jdk6.jsp.

Choose the .bin file, and not the *rpm.bin file, unless you know what you are doing. (The .bin file is the equivalent of .exe on Windows). Once the download is complete, in your terminal, cd to the directory where the file has been downloaded, and use the following commands:

$ chmod +x jdk-6u18-linux-i586.bin

$ ./jdk-6u18-linux-i586.bin

The file names above might differ depending on the JDK version that you have downloaded. The first line makes the installer executable, and the second line executes it. The installer should start now, and you should see the ‘Sun Microsystems, Inc. Binary Code License Agreement’.
Accept the licence, and the extraction of the JDK should start. Once the installer has exited, you should see a new sub-directory
named ‘jdk1.6.0_18’ inside the current directory. If you are familiar with Java programming on Windows, this should be easily recognisable. Inside this directory is the bin sub-directory, which has the Java compiler (javac), Java interpreter (java), and others. With this, we are all set; let’s write our first Java program on Linux. Fire up gedit and write the following Java code, which shows the usage of an array of integers:

import java.util.Random;

class ArrayDemo {
    public static void main(String[] args) {
        int[] arr = new int[10];
        for (int i = 0; i < 10; i++)
            arr[i] = (new Random()).nextInt();
        for (int i = 0; i < 10; i++)
            System.out.println("Element at index " + i + "is::" + arr[i]);
    }
}

Save the code to a file ‘ArrayDemo.java’, then compile
and run it as follows:

$ /home/amit/jdk1.6.0_18/bin/javac ArrayDemo.java
$ /home/amit/jdk1.6.0_18/bin/java ArrayDemo
Element at index 0is:: 480763582
Element at index 1is:: -1644219394
Element at index 2is:: -67518401
Element at index 3is:: 619258385
Element at index 4is:: 810878662
Element at index 5is:: 1055578962
Element at index 6is:: 1754667714
Element at index 7is:: 503295725
Element at index 8is:: 1129666934
Element at index 9is:: 1084281888

Note the first two commands, where I have given the full path to the location of the javac and java executables. Depending on where you have extracted the JDK, your path will vary. This is how you can compile, run, test and debug your Java programs.

OpenJDK

An article about Java programming in an open source magazine would be incomplete without talking about OpenJDK (http://openjdk.java.net/). It’s good for you to be aware of this project. As you might have already guessed, it is a GPL-licensed open source implementation of the Java
Standard Edition—i.e., the source code of the JDK that you are so familiar with, is also now available for your scrutiny, in case you don’t like something in the current JDK.

So, is this a different Java? No—you write the same Java code. You can install OpenJDK from your Linux distribution’s package manager (it may come pre-installed with some distributions). See http://openjdk.java.net/install/ for installation instructions.
Dealing with practicalities

For various reasons, deploying Linux lab-wide may not always be possible. In such cases, it’s a good idea to have a single Linux machine in the lab acting as an SSH server; you can install SSH client software on the other operating systems, which will enable connecting to the Linux machine remotely. This machine should have a relatively good configuration, depending on how many students will be using it for their coding and compilation—a dual- or quad-core CPU with 4 GB of RAM and a hard disk of at least 320 GB is a good idea.
For Windows, PuTTY (http://chiark.greenend.org.uk/~sgtatham/putty/download.html) is a widely used SSH client.
If you write the code on Windows and copy it to the Linux machine to compile and run, you will also need to download the pscp program from the same site, which lets you copy files from the local machine to the Linux SSH server.
If you need a GUI session from the Linux server to be accessible on the Windows machine (for example, while doing GUI programming), then investigate the OpenNX server (to be installed on the Linux server machine) and the NoMachine NX client for Windows. A machine with the configuration given above should support around 10 user sessions before it starts slowing down. Fine-tuning the desktop manager (use a light one like LXDE or Xfce) and using lighter editors like GVim for writing code is a good start.

Another option (which does not need a dedicated Linux server machine) is to install Linux in a virtual machine on your desktop. This could also prove useful on a home computer. VirtualBox (http://www.virtualbox.org/wiki/Downloads) is virtualisation software that, when installed on your Windows system, will allow you to create a virtual machine, inside which you can install Linux without disrupting your Windows installation.
You will, of course, need some free disk space (8 GB or more) for the virtual machine’s disk file. You don’t need to burn the Linux installation ISO onto a CD in this case—you can simply instruct VirtualBox to use the ISO image file as a disc inserted in the CD-ROM drive of the virtual machine.
This is also a good way to practice installing Linux, and to see how easy it can be. For Ubuntu, in particular, there is Wubi (http://wubiinstaller.org/) which lets you install (and uninstall) Ubuntu like any other Windows application, in a simple and safe way, ‘with a single click’. The Ubuntu files are stored in a single folder in your Windows drive, and an option to boot Ubuntu is added to your Windows boot-loader menu. However, hard-disk access is slightly slower than installation to a dedicated partition.
If your Windows drive is very fragmented, the performance will degenerate further. Hibernation is not supported under Wubi. Moreover, the Wubi filesystem is more vulnerable to hard reboots (turning off the power) and power failures than a normal installation to a dedicated partition, which provides a more robust filesystem that can better tolerate such events.
In general, programming on Linux will also require a decent level of familiarity with shell commands. Get comfortable working with the shell, and try to minimise the use of the mouse 🙂

Using your favourite IDE on Linux

If you have been using an IDE for your development needs, it should be great news that two very popular IDEs—NetBeans and Eclipse—have Linux versions as well, and both of them support C, C++ and Java development. For GNOME-based Linux distributions, Anjuta DevStudio (http://projects.gnome.org/anjuta/features.html) is another powerful IDE for C, C++ and Java (and other languages too). All three should be available in your distribution’s package manager.
To conclude this article, I would like to urge you to make an honest effort to embrace Linux for programming. It’s a much better world to be in. I would love to address any queries, concerns, comments or suggestions that you may have regarding this article.

Resources


  • ‘graphics.h’-like functionality using gcc: http://zaher14.blogspot.com/2007/01/graphicsh-in-linux.html
  • Brian W. Kernighan and Dennis M. Ritchie, “The C Programming Language”
  • Bjarne Stroustrup, “The C++ Programming Language”
  • Neil Matthew and Richard Stones, “Beginning Linux Programming”
  • StackOverflow.com is a community forum where you can post your programming-related questions. It’s language-neutral, which makes it very attractive.

  • VirtualBox User Manual: http://www.virtualbox.org/manual/UserManual.html


Source: Linux For You (Amit Saha)

Categories: GENERAL