Using the built-in comparators of basic MapReduce types to sort in descending order

mac2022-06-30  20

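Basic MapReduce types such as IntWritable, LongWritable, and Text each have a built-in comparator. We can override it to get a simple descending sort and register the subclass with Job.setSortComparatorClass(Class). By default keys are sorted in ascending order, smallest first. The default comparison looks like this: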

// For a LongWritable key, extend LongWritable.Comparator instead, and so on for the other types.
public static class MyNumberComparator extends IntWritable.Comparator {
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Delegates to the built-in comparison, so the order stays ascending.
        return super.compare(b1, s1, l1, b2, s2, l2);
    }
}

Putting a minus sign in front of the return value reverses the order, which gives a descending sort:

public static class MyNumberComparator extends IntWritable.Comparator {
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Negating the built-in result flips ascending into descending.
        return -super.compare(b1, s1, l1, b2, s2, l2);
    }
}
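To see why the negation works, the following is a minimal sketch that runs without Hadoop. It assumes the key is serialized as 4 big-endian bytes, the layout that DataOutput.writeInt produces, and compares keys through the same (bytes, start, length) signature. The class ReverseRawCompareDemo and its methods are hypothetical stand-ins for illustration, not Hadoop APIs.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Hadoop comparator, for illustration only.
class ReverseRawCompareDemo {

    // Serialize an int as 4 big-endian bytes (same layout as DataOutput.writeInt).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Ascending comparison on the serialized form: decode both ints, then compare.
    static int compareAscending(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int left = ByteBuffer.wrap(b1, s1, l1).getInt();
        int right = ByteBuffer.wrap(b2, s2, l2).getInt();
        return Integer.compare(left, right);
    }

    // Descending comparison: negate the ascending result.
    // Negation is only safe while the result can never be Integer.MIN_VALUE.
    static int compareDescending(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -compareAscending(b1, s1, l1, b2, s2, l2);
    }

    // Sort the serialized keys with the descending comparator and decode them again.
    static List<Integer> sortDescending(int... values) {
        List<byte[]> keys = new ArrayList<>();
        for (int v : values) {
            keys.add(serialize(v));
        }
        keys.sort((a, b) -> compareDescending(a, 0, a.length, b, 0, b.length));
        List<Integer> result = new ArrayList<>();
        for (byte[] k : keys) {
            result.add(ByteBuffer.wrap(k).getInt());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(3, -7, 42, 0)); // prints [42, 3, 0, -7]
    }
}
```

Swapping the two operands instead of negating would give the same order and avoids the overflow question entirely, but the negation matches the approach described above.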

I also wrote an example and verified that it works. The code and the result:

package com.mr2;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sort numbers using our own ordering rule.
public class SortOne {

    // Emits each input line (one integer per line) as the map output key.
    public static class MyMapper extends Mapper<LongWritable, Text, IntWritable, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String s = value.toString();
            int m = Integer.parseInt(s);
            context.write(new IntWritable(m), NullWritable.get());
        }
    }

    // Reverses the built-in IntWritable ordering. Must be static (see the error below).
    public static class MyNumberComparator extends IntWritable.Comparator {
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return -super.compare(b1, s1, l1, b2, s2, l2);
        }
    }

    public static void main(String[] args)
            throws IllegalArgumentException, IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(SortOne.class);
        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(NullWritable.class);
        // Register the custom comparator used to sort the map output keys.
        job.setSortComparatorClass(MyNumberComparator.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}

Result: [output image]

However, I also ran into an error while running the job. The error message was:

java.lang.Exception: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :java.lang.NoSuchMethodException: com.mr2.SortOne$MyNumberComparator.<init>()

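The cause was that I had not declared the MyNumberComparator class as static. Hadoop creates the comparator by reflection, looking up a no-argument constructor on the class passed to setSortComparatorClass. A non-static nested class (an inner class) holds an implicit reference to an instance of its enclosing class, so the constructor the compiler generates for it takes a SortOne argument and no no-argument constructor exists. That is exactly what the NoSuchMethodException for <init>() reports. A static nested class has no such hidden reference and can be instantiated on its own, without a SortOne object, so MyNumberComparator (like MyMapper) must be declared static. Moving it into its own top-level class would work as well.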
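The NoSuchMethodException for <init>() can be reproduced without Hadoop. The sketch below is a minimal illustration, assuming only that the framework looks up a no-argument constructor by reflection, as the error message suggests. The class NestedConstructorDemo and its members are hypothetical names.

```java
// Hypothetical demo: why a non-static nested class has no no-argument constructor.
class NestedConstructorDemo {

    // Inner (non-static) class: its constructor implicitly takes the enclosing instance.
    class InnerComparator { }

    // Static nested class: it has a genuine no-argument constructor.
    static class StaticComparator { }

    // Returns true if the class declares a constructor with zero parameters.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Returns the parameter count of the single constructor the compiler generated.
    static int constructorParameterCount(Class<?> type) {
        return type.getDeclaredConstructors()[0].getParameterCount();
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(StaticComparator.class)); // prints true
        System.out.println(hasNoArgConstructor(InnerComparator.class));  // prints false
        // The inner class constructor expects one argument: the enclosing instance.
        System.out.println(constructorParameterCount(InnerComparator.class)); // prints 1
    }
}
```

The inner class can still be created from code that already has an enclosing instance, but a reflective lookup with no arguments has nothing to pass for the hidden parameter, which is why the static modifier fixes the job.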
