How To Access Hadoop System Counters Programmatically
List Of Hadoop System Counters
Below is a list of all the Hadoop system counters, along with their counter groups and example values (taken from one of my own MapReduce applications).
Counter Group: File System Counters (org.apache.hadoop.mapreduce.FileSystemCounter)
- FILE: Number of bytes read: FILE_BYTES_READ: 176727
- FILE: Number of bytes written: FILE_BYTES_WRITTEN: 611042
- FILE: Number of read operations: FILE_READ_OPS: 0
- FILE: Number of large read operations: FILE_LARGE_READ_OPS: 0
- FILE: Number of write operations: FILE_WRITE_OPS: 0
- HDFS: Number of bytes read: HDFS_BYTES_READ: 105677917
- HDFS: Number of bytes written: HDFS_BYTES_WRITTEN: 447
- HDFS: Number of read operations: HDFS_READ_OPS: 6
- HDFS: Number of large read operations: HDFS_LARGE_READ_OPS: 0
- HDFS: Number of write operations: HDFS_WRITE_OPS: 2
Counter Group: Job Counters (org.apache.hadoop.mapreduce.JobCounter)
- Launched map tasks: TOTAL_LAUNCHED_MAPS: 1
- Launched reduce tasks: TOTAL_LAUNCHED_REDUCES: 1
- Rack-local map tasks: RACK_LOCAL_MAPS: 1
- Total time spent by all maps in occupied slots (ms): SLOTS_MILLIS_MAPS: 95592
- Total time spent by all reduces in occupied slots (ms): SLOTS_MILLIS_REDUCES: 11064
- Total time spent by all map tasks (ms): MILLIS_MAPS: 47796
- Total time spent by all reduce tasks (ms): MILLIS_REDUCES: 5532
- Total vcore-seconds taken by all map tasks: VCORES_MILLIS_MAPS: 47796
- Total vcore-seconds taken by all reduce tasks: VCORES_MILLIS_REDUCES: 5532
- Total megabyte-seconds taken by all map tasks: MB_MILLIS_MAPS: 73414656
- Total megabyte-seconds taken by all reduce tasks: MB_MILLIS_REDUCES: 11329536
Counter Group: Map-Reduce Framework (org.apache.hadoop.mapreduce.TaskCounter)
- Map input records: MAP_INPUT_RECORDS: 39129
- Map output records: MAP_OUTPUT_RECORDS: 32295
- Map output bytes: MAP_OUTPUT_BYTES: 370059
- Map output materialized bytes: MAP_OUTPUT_MATERIALIZED_BYTES: 176723
- Input split bytes: SPLIT_RAW_BYTES: 139
- Combine input records: COMBINE_INPUT_RECORDS: 32295
- Combine output records: COMBINE_OUTPUT_RECORDS: 29495
- Reduce input groups: REDUCE_INPUT_GROUPS: 29495
- Reduce shuffle bytes: REDUCE_SHUFFLE_BYTES: 176723
- Reduce input records: REDUCE_INPUT_RECORDS: 29495
- Reduce output records: REDUCE_OUTPUT_RECORDS: 50
- Spilled Records: SPILLED_RECORDS: 58990
- Shuffled Maps: SHUFFLED_MAPS: 1
- Failed Shuffles: FAILED_SHUFFLE: 0
- Merged Map outputs: MERGED_MAP_OUTPUTS: 1
- GC time elapsed (ms): GC_TIME_MILLIS: 603
- CPU time spent (ms): CPU_MILLISECONDS: 59310
- Physical memory (bytes) snapshot: PHYSICAL_MEMORY_BYTES: 1158512640
- Virtual memory (bytes) snapshot: VIRTUAL_MEMORY_BYTES: 6419664896
- Total committed heap usage (bytes): COMMITTED_HEAP_BYTES: 1595932672
Counter Group: Shuffle Errors (Shuffle Errors)
- BAD_ID: BAD_ID: 0
- CONNECTION: CONNECTION: 0
- IO_ERROR: IO_ERROR: 0
- WRONG_LENGTH: WRONG_LENGTH: 0
- WRONG_MAP: WRONG_MAP: 0
- WRONG_REDUCE: WRONG_REDUCE: 0
Counter Group: File Input Format Counters (org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter)
- Bytes Read: BYTES_READ: 105677778
Counter Group: File Output Format Counters (org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter)
- Bytes Written: BYTES_WRITTEN: 447
Accessing the Counters
You can access a counter on a specific job using the following (you need the counter group name and the counter name):
// `job` is an org.apache.hadoop.mapreduce.Job; getValue() returns a long
long hdfsBytesRead = job.getCounters().findCounter(
    "org.apache.hadoop.mapreduce.FileSystemCounter",
    "HDFS_BYTES_READ"
).getValue();
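For the built-in groups there is also a type-safe alternative: the TaskCounter, JobCounter, and FileSystemCounter enums in org.apache.hadoop.mapreduce can be passed to findCounter() directly, which avoids typos in the string names. A minimal sketch, assuming job is the same completed Job as above and a Hadoop 2.x-era API where these enums and overloads exist:

import org.apache.hadoop.mapreduce.FileSystemCounter;
import org.apache.hadoop.mapreduce.TaskCounter;

// Framework (task) counters: look up directly by enum constant
long mapInputRecords =
    job.getCounters().findCounter(TaskCounter.MAP_INPUT_RECORDS).getValue();

// File system counters: the scheme ("HDFS", "FILE", ...) is passed separately
long hdfsBytesRead =
    job.getCounters().findCounter("HDFS", FileSystemCounter.BYTES_READ).getValue();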
Listing all the Counters
You can use the following code to list all the counters available for your job:
// Counter and CounterGroup live in org.apache.hadoop.mapreduce
for (CounterGroup group : job.getCounters()) {
    System.out.println("* Counter Group: " + group.getDisplayName() + " (" + group.getName() + ")");
    System.out.println("  Number of counters in this group: " + group.size());
    for (Counter counter : group) {
        System.out.println("  - " + counter.getDisplayName() + ": " + counter.getName() + ": " + counter.getValue());
    }
}
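For context, here is a minimal, hypothetical driver (the CounterDemo class name is my own) showing where the counter lookup fits. Counter values are only final once the job completes, so read them after waitForCompletion() has returned:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CounterDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "counter demo");
        // ... set your mapper, reducer, input and output paths here ...

        boolean success = job.waitForCompletion(true);

        // Read a counter only after the job has finished
        long hdfsBytesRead = job.getCounters().findCounter(
            "org.apache.hadoop.mapreduce.FileSystemCounter",
            "HDFS_BYTES_READ"
        ).getValue();
        System.out.println("HDFS bytes read: " + hdfsBytesRead);

        System.exit(success ? 0 : 1);
    }
}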
Hopefully someone finds this information useful!
If you enjoyed this post, please check out my other blog posts.