4

I want to run Nutch on Linux. I am logged in as root, and I have set all the environment variables and Nutch configuration files. I have created a url.txt file that contains the URL to crawl. When I try to run Nutch with the following command,

bin/nutch crawl urls -dir pra

it fails with the following exception:

crawl started in: pra
rootUrlDir = urls
threads = 10
depth = 5
Injector: starting
Injector: crawlDb: pra/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Failed to get the current user's information.
        at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:717)
        at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:592)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:113)
Caused by: javax.security.auth.login.LoginException: Login failed: Cannot run program "whoami": java.io.IOException: error=12, Cannot allocate memory
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
        at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:715)
        ... 5 more

The server has enough memory to run any Java application. I have attached the statistics:

            total       used       free  
Mem:        524320     194632     329688 
-/+ buffers/cache:     194632     329688
Swap:      2475680          0    2475680
Total:     3000000     194632    2805368

Is this enough memory for Nutch? Could someone please help me? I am new to Linux and Nutch. Thanks in advance.

mt3
  • 3
    It's a really bad idea to run things as root that don't need to be run as root. Not only is it a security nightmare, but it can interfere with things that *do* need to be run as root, like servers. – Paul Tomblin Dec 28 '09 at 14:25

4 Answers

2

Read the output:

Cannot run program "whoami": java.io.IOException: error=12, Cannot allocate memory

It looks like you don't have enough RAM, or you have no swap file/partition.

Aaron Digulla
  • There is enough memory on the server. Output of free: total used free shared buffers cached / Mem: 524320 213700 310620 0 0 0 / -/+ buffers/cache: 213700 310620 / Swap: 2475680 0 2475680 / Total: 3000000 213700 2786300. Can you please suggest something? –  Jan 08 '10 at 18:59
  • At the moment whoami is called, there is no memory left, so you must run the stats command while the tool is running, not afterwards. – Aaron Digulla Jan 10 '10 at 14:05
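
One way to follow that advice is to sample memory while the crawl is still running. This is only a minimal sketch, assuming a POSIX shell and that free is installed; watch_mem is a hypothetical helper name, not part of Nutch or Hadoop.

```shell
# watch_mem PID [INTERVAL]: print a timestamp and memory stats every
# INTERVAL seconds (default 1) for as long as process PID is alive.
# watch_mem is a hypothetical helper, not a Nutch or Hadoop tool.
watch_mem() {
  pid=$1
  interval=${2:-1}
  # kill -0 sends no signal; it only checks that the process exists.
  while kill -0 "$pid" 2>/dev/null; do
    date '+%H:%M:%S'
    free -k 2>/dev/null || echo "free not available on this system"
    sleep "$interval"
  done
}
```

For example, start the crawl in the background and watch it: `bin/nutch crawl urls -dir pra & watch_mem $!`. The last samples printed before the exception are the ones that matter.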
2

Calling an external executable (like whoami) from Java forks the JVM, which requires room for an entire copy of the Java process first. Lower your maximum heap size (for example, -Xmx256m) so that two copies can fit in RAM at the same time.
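
The sizing rule here is simple arithmetic. Below is a rough sketch, assuming the heap dominates the process footprint (in practice the JVM needs more than -Xmx, so this is an optimistic bound). fits_twice is a hypothetical helper, and 524320 is the "Mem: total" value in kB from the question.

```shell
# fits_twice HEAP_MB RAM_KB: succeed if two heaps of HEAP_MB megabytes
# fit into RAM_KB kilobytes. Hypothetical helper; it ignores non-heap
# JVM memory, so treat a "fit" as an optimistic estimate.
fits_twice() {
  heap_kb=$(( $1 * 1024 ))
  [ $(( heap_kb * 2 )) -le "$2" ]
}

ram_kb=524320   # "Mem: total" from the free output in the question
for heap_mb in 256 512 1000; do
  if fits_twice "$heap_mb" "$ram_kb"; then
    echo "-Xmx${heap_mb}m: two copies fit"
  else
    echo "-Xmx${heap_mb}m: two copies do not fit"
  fi
done
```

With these numbers only the 256 MB heap fits twice. If your copy of bin/nutch reads a NUTCH_HEAPSIZE variable (an assumption about your version; check the comments at the top of the script), that would be the place to lower the heap.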

1

On a 32-bit installation of an operating system, the JVM (Java Virtual Machine) cannot handle more than 4GB of memory. If you want the JVM to use more than 4GB, you have to use a 64-bit JVM, which also means the operating system must be 64-bit.
I presume that is why you are getting that error. You have 5GB memory and that could be the problem. You should either tell your application to use only 75% of the available memory or try reducing the RAM to 4GB and checking again. I had the same issue with the Zimbra messaging solution, which uses Java for its web interface.

proy
0

It is possible that your server has memory overcommit disabled (see /proc/sys/vm/overcommit_memory). Without overcommit, a fork system call requires that your server have enough RAM or swap for a complete second copy of the Java process. This may be a lot of RAM.
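
To check this on your server, read the current mode from that file. Here is a small sketch; the mode meanings (0 heuristic, 1 always overcommit, 2 strict accounting) follow the Linux kernel documentation, and describe_overcommit is a hypothetical helper name.

```shell
# describe_overcommit MODE: translate the numeric value found in
# /proc/sys/vm/overcommit_memory into a short description.
# Hypothetical helper; the meanings follow the kernel's documentation.
describe_overcommit() {
  case "$1" in
    0) echo "heuristic overcommit (kernel default)" ;;
    1) echo "always overcommit" ;;
    2) echo "strict accounting, no overcommit" ;;
    *) echo "unknown mode: $1" ;;
  esac
}

mode_file=/proc/sys/vm/overcommit_memory
if [ -r "$mode_file" ]; then
  describe_overcommit "$(cat "$mode_file")"
else
  echo "cannot read $mode_file (not Linux, or /proc not mounted)"
fi
```

Mode 2 is the "overcommit disabled" case described in this answer.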

Zan Lynx