Where is the Hadoop examples jar



Okay, I admit I spent a lot of time searching for this, so I am documenting it as a blog post.

The Hadoop examples jar is present at the following path on Red Hat systems:

/usr/lib/hadoop-0.20-mapreduce
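
If the jar is not under that directory on your system, a search along these lines may help. This is only a sketch: `find_examples_jar` is a hypothetical helper name, and the `hadoop*examples*.jar` pattern is an assumption about how your distribution names the file.

```shell
# Hypothetical helper: list files that look like a Hadoop examples jar.
# The name pattern is an assumption; adjust it for your distribution.
find_examples_jar() {
  # $1 = directory to search (defaults to /usr/lib)
  find "${1:-/usr/lib}" -name 'hadoop*examples*.jar' 2>/dev/null || true
}

find_examples_jar /usr/lib
```

Unreadable directories are skipped silently, so an empty result only means nothing matched where you were allowed to look.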

Error injecting: de.saumya.mojo.ruby.gems.DefaultGemManager

[WARNING] Error injecting: de.saumya.mojo.ruby.gems.DefaultGemManager
java.lang.NoClassDefFoundError: org/sonatype/aether/RepositorySystemSession
 
Solution
Check that the plugins used in the pom are at their latest versions. For a quick hint about which plugin is faulty, see the links below.
 
Set the dependency plugin to version 2.8 in your pom:

<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>2.8</version>
      </plugin>
      <!-- ... rest of the pom unchanged ... -->
    </plugins>
  </pluginManagement>
</build>
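
To see which version a pom currently declares for a plugin, a rough text search can be enough. The sketch below is not a real XML parser: `plugin_version` is a hypothetical helper, and it assumes the `<version>` tag sits on the line directly after the `<artifactId>` tag, as in the snippet above.

```shell
# Hypothetical helper: print the <version> that directly follows a plugin's
# <artifactId> line in a pom file. Assumes one tag per line.
plugin_version() {
  # $1 = artifactId, $2 = path to pom.xml
  grep -A1 "<artifactId>$1</artifactId>" "$2" |
    sed -n 's:.*<version>\(.*\)</version>.*:\1:p'
}

# Example: prints the declared version if ./pom.xml exists.
if [ -f pom.xml ]; then
  plugin_version maven-dependency-plugin pom.xml
fi
```

If nothing is printed, the plugin either has no explicit version in that file or the tags are laid out differently.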

 
Related links
 
https://cwiki.apache.org/confluence/display/MAVEN/AetherClassNotFound
 
https://github.com/torquebox/jruby-maven-plugins/issues/50
 

BigData Media and Entertainment

Big data now plays a vital role in every industry vertical.

Telecom companies use it to serve customers better by monitoring their networks and improving network operations.

Transport companies are trying to optimize everything from predicting airline arrival times accurately to improving baggage handling.

The financial industry has been using it for better fraud prediction and analysis, among many other things.

The health industry uses it for better patient care and faster medicine research.

This post tries to analyze the role of big data in media and entertainment.

Let's look at both sides of the coin first:

  • Viewer
  • Media and entertainment content producer

Then, at the end, I will present a technical solution that connects all of these dots using the technologies available in big data.

Wearing the viewer's hat now

As a viewer today I have lots and lots of choices for spending my leisure time: Facebook, online music, YouTube, movies, radio, you name it. If I don't like what is being shown or streamed to me and I have control over it (e.g. YouTube), I will switch to the next available option on the same channel by changing the current video.

However, if I don't have control over what is being shown (e.g. a TV channel), I will simply switch channels. If you are showing something I don't like, I will just press the next button on the remote, and bang.

However, I would like you to understand my problems.

TV

I don't want to spend all my time pressing the next button on the remote while sitting in front of the TV. I have very little entertainment time, and I want to spend it enjoying something rather than clicking next. So try to understand my entertainment needs. I have already given away so much data about myself and my preferences in my Twitter hashtags (e.g. #QandA) and in Facebook posts about you and your content. You still cannot understand me? My TV streams over the internet; can't you show dynamic content based on what you already know about me? And those ads... really?

Radio

I don't want you to present ads to me that are totally irrelevant to my current location, mood, habits and conditions. For example, what will I do with a radio ad for a pizza shop 1000 miles away from me? Why can't you play content relevant to my locality and the current time?

Newspaper

Oh yes, this one is for online newspapers. Do you think I will pay to view your website in this free world? I often wonder why some papers think that is the only way for them to earn money. They have failed to connect with the local community; the most important revenue stream for print media should be people's inner zeal to buy the paper. Stay relevant to the local community where you publish, and yet be global. Invite people to connect with you through opinions and letters, and create a virtuous circle of attachment with them. Then use that data to produce relevant content for your online platform; this will help you stay relevant and monetize your website. Remember that your sole purpose of existence is humans, not the Google bot. You have to build a platform whose content I connect with, and with which I take pleasure in interacting and talking. Content that changes based on my location and my interests. And don't tell me you don't have this data about me. You already know everything.
See how platforms like Reddit are earning. Can you see something there? Reddit has no content of its own, yet it attracts people and makes money.


To be Continued...

Failed to open/create the internal network (VERR_INTNET_FLT_IF_NOT_FOUND)

I got this error while using VirtualBox on Ubuntu.

Because of it, networking was not working properly.

It's a bug in VirtualBox 4.2.16 and fixed in 4.1.18.

It happens with the 3.11 kernel.

http://aptosid.com/index.php?name=PNphpBB2&file=viewtopic&t=2665&view=previous
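
A quick way to see whether a host matches the combination described here is to look at the kernel release and the installed VirtualBox version. This is a sketch only; `is_kernel_311` is a hypothetical helper name, and the version numbers it relates to are the ones quoted in this post.

```shell
# Hypothetical helper: succeed if a kernel release string is from the 3.11 series.
is_kernel_311() {
  case "$1" in
    3.11|3.11.*|3.11-*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_kernel_311 "$(uname -r)"; then
  echo "Host kernel is in the 3.11 series"
else
  echo "Host kernel is $(uname -r)"
fi

# VBoxManage --version prints the installed VirtualBox version, if present.
if command -v VBoxManage >/dev/null 2>&1; then
  VBoxManage --version
fi
```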


 

 


How to share packages with other ubuntu systems via apt-cacher-ng

I have a host system running Ubuntu 13.10 on which I install packages via

apt-get install

Many virtual machines run on top of this host machine.

I want these virtual machines to install from the packages the host has already downloaded, instead of going to the internet first, to save time.

I want to share only what I need, and to download it only once, on the host machine.

Solution:

apt-cacher-ng

This creates a proxy so that packages are downloaded only once and all machines can quickly fetch them from it.

Here are the install steps.

On the base host machine, where I want to run the apt-cacher-ng proxy:

sudo apt-get install apt-cacher-ng

Open a browser to check that it installed successfully:

http://myip:3142

For me it's

http://nanak-p570wm:3142

Now, on all client machines, including the host machine, tell apt to use the proxy to get packages:

# Write the proxy settings as root; a plain "cat >" redirect fails without root.
sudo tee /etc/apt/apt.conf.d/01proxy > /dev/null <<EOF
Acquire::http { Proxy "http://nanak-P570WM:3142"; };
Acquire::ftp { Proxy "ftp://nanak-P570WM:3142"; };
EOF
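
With many clients, it can be handy to generate the proxy line from the cache host name instead of typing it each time. A minimal sketch follows; `apt_proxy_line` is a hypothetical helper, and it only covers the http entry.

```shell
# Hypothetical helper: print an apt proxy line for a given cache host and port.
# 3142 is the port used in the URLs above.
apt_proxy_line() {
  printf 'Acquire::http { Proxy "http://%s:%s"; };\n' "$1" "${2:-3142}"
}

# Preview the line for the cache host used in this post.
apt_proxy_line nanak-P570WM

# To install it on a client (needs root):
#   apt_proxy_line nanak-P570WM | sudo tee /etc/apt/apt.conf.d/01proxy
# Afterwards, check what apt picked up (apt-config ships with apt):
#   apt-config dump | grep -i proxy
```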

That's it. Try downloading a few packages to test it and see the magic.


References

http://linuxexpresso.wordpress.com/2011/02/13/howto-apt-cacher-ng-on-ubuntu/
http://www.ubuntugeek.com/apt-cacher-ng-http-download-proxy-for-software-packages.html

 

GPE storm detected , transactions will use polling mode

On my machine I have been experiencing frequent system hangs, which I can only get out of with a hard reboot.

OS : Ubuntu 13.10
Hardware Clevo P570WM

CPU Intel(R) Core(TM) i7-3970X CPU @ 3.50GHz

Graphics card
GK104M [GeForce GTX 780M]
driver=nouveau

Wireless

Wireless Intel Corporation 7260
driver=iwlwifi driverversion=3.11.0-7-generic firmware=22.0.7.0

BIOS
American Megatrends Inc.
version: 4.6.5
date: 03/20/2013
capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd



I am trying to diagnose what is in the syslog at the time of the hang.

Here is my ongoing analysis.

So far I have found neither the problem nor a solution.

It could be a hardware issue.
It could be a software issue.

I am grouping the syslog messages from hang time into separate posts on this blog, along with the corresponding steps I took.



syslog messages

Sep 14 10:28:55 nanak-P570WM kernel: [ 4855.198902] ACPI: EC: GPE storm detected(9 GPEs), transactions will use polling mode
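
To see how often this message shows up, the log can be searched for it. A small sketch, where `count_gpe_storms` is a hypothetical helper and `/var/log/syslog` is assumed to be where your system keeps the log:

```shell
# Hypothetical helper: count "GPE storm detected" lines in a log file.
count_gpe_storms() {
  # $1 = log file (defaults to /var/log/syslog)
  grep -c 'GPE storm detected' "${1:-/var/log/syslog}" 2>/dev/null
}

# grep exits non-zero when the count is 0 or the file is unreadable.
count_gpe_storms /var/log/syslog || true
```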

I read up on what a GPE storm is; it can be a hardware issue. One of the suggested solutions was to add a boot parameter that tells ACPI this is a Linux machine.

I did the following.

Open the GRUB defaults file:

sudo gedit /etc/default/grub

Change the default kernel command line in it:

#Commented as per http://forums.linuxmint.com/viewtopic.php?f=42&t=56323
#GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash acpi_osi=Linux"

Regenerate the GRUB configuration:

sudo update-grub
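
The new parameter only takes effect on the next boot. After rebooting, one way to confirm it reached the kernel is to look at /proc/cmdline. A minimal sketch; `has_kernel_param` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a kernel command line file contains the parameter.
has_kernel_param() {
  # $1 = parameter, $2 = file to read (defaults to /proc/cmdline)
  file="${2:-/proc/cmdline}"
  [ -r "$file" ] || return 1
  # One parameter per line, then look for an exact match.
  tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

if has_kernel_param acpi_osi=Linux; then
  echo "acpi_osi=Linux is active"
else
  echo "acpi_osi=Linux is not on the running kernel's command line"
fi
```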

Let's see how it goes.

Update: It has been more than 24 hours since I last had the issue. Things seem to be working with the acpi_osi change to the grub configuration.
The last error I had was on 13 Sept 2013, when I changed this setting.

Build and compile Hadoop from source

Install some base libraries

sudo apt-get -y install maven build-essential protobuf-compiler autoconf automake \
  libtool cmake zlib1g-dev pkg-config libssl-dev


Check out the code from the repo

svn co http://svn.apache.org/repos/asf/hadoop/common/trunk/ hadoop
cd hadoop/

mvn compile


This gave me

[ERROR] Could not find goal 'protoc' in plugin org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals -> [Help 1]

https://issues.apache.org/jira/browse/HADOOP-9383

To debug, I re-ran with

mvn compile -X

The log showed that I did not have protobuf 2.5, which is required to build Hadoop.

Follow the steps here to install protobuf on your system:

http://jugnu-life.blogspot.com/2013/09/install-protobuf-25-on-ubuntu.html

Check the installed version of protobuf; the required version is also documented in

https://svn.apache.org/repos/asf/hadoop/common/trunk/BUILDING.txt
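
Before starting a long build, it may be worth checking the protoc version up front. The sketch below assumes `protoc --version` prints a line of the form `libprotoc 2.5.0`; `is_protoc_25` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a `protoc --version` line reports 2.5.x.
is_protoc_25() {
  case "$1" in
    "libprotoc 2.5"|"libprotoc 2.5."*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v protoc >/dev/null 2>&1; then
  v=$(protoc --version)
  if is_protoc_25 "$v"; then
    echo "OK: $v"
  else
    echo "Found $v, but this build expects 2.5"
  fi
else
  echo "protoc not found on PATH"
fi
```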

Okay, after fixing the above (if required), here is the final build command:

mvn package -Pdist -DskipTests -Dtar

Further reading


http://wiki.apache.org/hadoop/HowToContribute



Install protobuf 2.5 on Ubuntu

Check which version you have

You can use the Synaptic package manager to find it.

If you want to install manually

Copy the commands below into a script:

sudo apt-get install build-essential
mkdir /tmp/protobuf_install
cd /tmp/protobuf_install
wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
tar xzvf protobuf-2.5.0.tar.gz
cd  protobuf-2.5.0
./configure
make
make check
sudo make install
sudo ldconfig
protoc --version 


Check the installed version

$ protoc --version
libprotoc 2.5.0



To see why ldconfig is needed, read the link below
http://code.google.com/p/protobuf/issues/detail?id=213 

Related error searches

[DEBUG] protoc: error while loading shared libraries: libprotoc.so.8:
protoc: error while loading shared libraries: 
 
[DEBUG]  libprotobuf.so.6: cannot open shared 
object file: No such file or directory
 
Cannot find `protoc` command 
 
If you came to this page searching for one of the errors above, run
 
sudo ldconfig
 
The details are at the link above.
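
To check whether the dynamic linker already knows about the library, the cache listing from `ldconfig -p` can be searched. A sketch under the assumption that the library name contains `libprotoc.so`; `cache_has_lib` is a hypothetical helper.

```shell
# Hypothetical helper: read `ldconfig -p` output on stdin and succeed
# if any cached library entry contains the given string.
cache_has_lib() {
  grep -qF -- "$1"
}

if ldconfig -p 2>/dev/null | cache_has_lib libprotoc.so; then
  echo "libprotoc is in the linker cache"
else
  echo "libprotoc is not in the linker cache; try: sudo ldconfig"
fi
```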
 
 

Change Eclipse java and memory settings

Eclipse can be told to use a specific version of Java if you have several on your system.

Open the eclipse.ini file.

Add the following to it. The entry I added is the -vm line and the Java path below it; it has to come before -vmargs.

org.eclipse.platform
--launcher.XXMaxPermSize
2560m
--launcher.defaultAction
openFile
-vm
/home/jagat/development/tools/jdk1.6.0_45/bin/java

--launcher.appendVmargs
-vmargs
-Dosgi.requiredJavaVersion=1.6
-XX:MaxPermSize=2560m
-Xms400m
-Xmx5120m
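
To double-check which Java an existing eclipse.ini points at, the line after `-vm` can be printed. A minimal sketch; `eclipse_vm_path` is a hypothetical helper, and the eclipse.ini location in the example is an assumption.

```shell
# Hypothetical helper: print the line that follows "-vm" in an eclipse.ini file.
eclipse_vm_path() {
  # $1 = path to eclipse.ini
  awk 'prev == "-vm" { print; exit } { prev = $0 }' "$1"
}

# Example (the location is an assumption; point it at your own eclipse.ini):
ini="$HOME/development/tools/eclipse/eclipse.ini"
if [ -f "$ini" ]; then
  eclipse_vm_path "$ini"
fi
```

If nothing is printed, the file has no `-vm` entry and Eclipse falls back to whatever Java it finds on its own.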

Change memory settings

You can also increase the memory allocated to Eclipse here.

If you look carefully, I added one extra 0 to each of the memory values above, e.g. 40 became 400.

Change them depending on what your system needs.

Useful websites for maven

Some websites you can use while working with Maven:

http://mvnrepository.com

http://search.maven.org




Eclipse menu not working on Ubuntu 13.10

I downloaded Eclipse on Ubuntu 13.10. As you know, Ubuntu shows every application's menu in the top bar, and Eclipse was no exception.

The strange thing, however, was that I could not do anything with those menu entries.

So I searched and found this bug:

https://bugs.launchpad.net/ubuntu/+source/indicator-appmenu/+bug/659931

Read it if you like.

How to solve

Instead of clicking on Eclipse directly, start it with the following command:

UBUNTU_MENUPROXY= /home/jagat/development/tools/eclipse/eclipse

where the path on the right is the exact path where your Eclipse lives.

This should bring the menus back inside the Eclipse window, as on other operating systems, and they should work.
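
If you start Eclipse this way often, the empty variable can be wrapped in a small shell function. This is a sketch: `run_without_menuproxy` is a hypothetical name, and it relies on the shell passing the empty `UBUNTU_MENUPROXY` only to the program being started.

```shell
# Hypothetical wrapper: run a program with UBUNTU_MENUPROXY set to an empty
# value in that program's environment.
run_without_menuproxy() {
  UBUNTU_MENUPROXY= "$@"
}

# Demonstration with a child shell that prints what it received.
run_without_menuproxy sh -c 'echo "UBUNTU_MENUPROXY=[$UBUNTU_MENUPROXY]"'

# For Eclipse, using the path from above (adjust it to your install):
#   run_without_menuproxy /home/jagat/development/tools/eclipse/eclipse
```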

However, I am still wondering why this did not work on Ubuntu out of the box.




Ubuntu safe remove usb drive

To safely remove a USB disk drive on Ubuntu:

Right-click on the USB disk drive

Select

Safely remove parent drive


Getting started with Scala IDE Tutorial 1

 

Download the Scala editor

http://scala-ide.org/

Create new Scala Project


Create first Object

File > New > Scala Object


Write first Class


Run As Scala Application

Ctrl + F11

See the output in the Console

Congrats: you have run your first Scala IDE program.

Let's see what comes next.