Sample output of Rumen or Input to Gridmix

Question

I want to see JobHistory logs that can be fed as input to Rumen. More specifically, I am interested in the input format for Gridmix.

I tried the following two things:

1) I found this file: . What exactly is this file? Is it the output format of Rumen? Is a file similar to this sufficient input for Gridmix?

2) I also tried setting up Hadoop and running a job to see what logs it produces. However, I am setting up Hadoop (YARN and MapReduce) for the first time, so I have no prior knowledge of its setup. I am using Hadoop YARN version 3.0.0.

I am running in Pseudo-Distributed Operation. (Are JobHistory logs generated in Pseudo-Distributed Operation?)

I have enabled YARN log aggregation as suggested here.

This article talks about where the logs can be found. It says that they are in the following directory in HDFS:

/user/uname/.staging/job_id/

However, I am not able to find this directory. This is the error message I get:

bin/hdfs dfs -ls /user/uname/.staging
ls: `/user/uname/.staging': No such file or directory

I tried searching for /user/uname/.staging in the local file system too, but I got the same "directory doesn't exist" error. (This was expected, as I didn't create it.) I did create /user/uname in HDFS while doing the setup, and /user/uname/ does exist, but the .staging subdirectory is not in it.
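One way to narrow this down is to list the locations Hadoop would use if nothing in mapred-site.xml overrides them. The sketch below is not a confirmed fix. It assumes the stock default of /tmp/hadoop-yarn/staging for yarn.app.mapreduce.am.staging-dir, with the history directories derived from it, and it assumes the hdfs launcher is at bin/hdfs as in the command above. The variable names are made up for illustration.

```shell
#!/bin/sh
# Assumed defaults from mapred-default.xml; adjust if mapred-site.xml overrides them.
STAGING_ROOT="/tmp/hadoop-yarn/staging"                        # yarn.app.mapreduce.am.staging-dir
USER_STAGING="$STAGING_ROOT/uname/.staging"                    # per-user staging area for running jobs
HISTORY_INTERMEDIATE="$STAGING_ROOT/history/done_intermediate" # mapreduce.jobhistory.intermediate-done-dir
HISTORY_DONE="$STAGING_ROOT/history/done"                      # mapreduce.jobhistory.done-dir

# Path to the hdfs launcher; bin/hdfs matches the command used in this post.
HDFS_CMD="${HDFS_CMD:-bin/hdfs}"

for dir in "$USER_STAGING" "$HISTORY_INTERMEDIATE" "$HISTORY_DONE"; do
  echo "Checking $dir"
  if command -v "$HDFS_CMD" >/dev/null 2>&1; then
    # List recursively; a missing directory is reported but does not stop the loop.
    "$HDFS_CMD" dfs -ls -R "$dir" || echo "  not found: $dir"
  else
    echo "  skipped: $HDFS_CMD is not available"
  fi
done
```

If these directories are also empty, it may be worth checking that the job actually ran on YARN (mapreduce.framework.name set to yarn) rather than in local mode.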

This Stack Overflow answer discusses a similar issue, but it is not clear how I can adapt it to my problem. It is also for an older version of Hadoop and appears not to work for Hadoop 3.0.0.

It would be great if someone could point me to a sample Gridmix input or a sample Rumen input or output, or help me figure out where the log files generated by my runs are going.

