Some debugging code I use
Count the number of records in each partition
List<Tuple2<Integer, Integer>> numOfRecordPerPartition = javaRDD.mapPartitionsWithIndex(
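The statement above is cut off in the source, so the following is only a minimal sketch of how such a count could be written, not the original code. The class name `PartitionRecordCounts` and the helper `countPerPartition` are hypothetical; it assumes Spark's Java API with Java 8 lambdas.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function2;
import scala.Tuple2;

public class PartitionRecordCounts {
    // Hypothetical helper (not from the original notes):
    // returns one (partitionIndex, recordCount) pair per partition.
    public static <T> List<Tuple2<Integer, Integer>> countPerPartition(JavaRDD<T> javaRDD) {
        Function2<Integer, Iterator<T>, Iterator<Tuple2<Integer, Integer>>> countRecords =
            (index, records) -> {
                int count = 0;
                // Walk the partition's iterator once, counting records.
                while (records.hasNext()) {
                    records.next();
                    count++;
                }
                return Collections.singletonList(new Tuple2<>(index, count)).iterator();
            };
        // preservesPartitioning = false: the result is collected right away,
        // so keeping the partitioner would bring no benefit.
        return javaRDD.mapPartitionsWithIndex(countRecords, false).collect();
    }
}
```

Because only one small tuple per partition travels back to the driver, this should be cheap to collect even for a large RDD, which makes it handy for spotting skewed or empty partitions.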
Collect the first record of each partition
List<Tuple2<Integer, T>> firstRecordPerPartition = javaRDD.mapPartitionsWithIndex(
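This statement is also cut off in the source. One way it might be completed is sketched below, assuming that empty partitions should simply contribute no entry. The class name `PartitionFirstRecords` and the helper `firstPerPartition` are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function2;
import scala.Tuple2;

public class PartitionFirstRecords {
    // Hypothetical helper (not from the original notes):
    // returns (partitionIndex, firstRecord) for every non-empty partition.
    public static <T> List<Tuple2<Integer, T>> firstPerPartition(JavaRDD<T> javaRDD) {
        Function2<Integer, Iterator<T>, Iterator<Tuple2<Integer, T>>> takeFirst =
            (index, records) -> {
                if (!records.hasNext()) {
                    // Assumption: an empty partition yields nothing instead of a null entry.
                    return Collections.<Tuple2<Integer, T>>emptyIterator();
                }
                // Only the first record is read; the rest of the iterator is left untouched.
                return Collections.singletonList(new Tuple2<>(index, records.next())).iterator();
            };
        return javaRDD.mapPartitionsWithIndex(takeFirst, false).collect();
    }
}
```

Since each task stops after one record, this tends to be a quick way to check how records are laid out across partitions without pulling the whole RDD to the driver.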
Inspect the PreferredLocations of the current RDD
Partition[] partitions = javaRDD.rdd().getPartitions();
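The source stops after fetching the partitions. A minimal sketch of the remaining step, looking up the preferred locations of each one, could look like this. The class name `PreferredLocationsDump` and the helper `describePreferredLocations` are hypothetical, and the sketch swaps in `javaRDD.partitions()` for `rdd().getPartitions()`, since the former is the public accessor on `JavaRDD`.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.Partition;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.rdd.RDD;

public class PreferredLocationsDump {
    // Hypothetical helper (not from the original notes):
    // builds one line per partition listing its preferred locations.
    public static <T> List<String> describePreferredLocations(JavaRDD<T> javaRDD) {
        RDD<T> rdd = javaRDD.rdd();
        List<String> lines = new ArrayList<>();
        for (Partition partition : javaRDD.partitions()) {
            // preferredLocations returns a Scala Seq<String>; string concatenation
            // uses its toString(), so no Scala-to-Java conversion is needed here.
            lines.add("partition " + partition.index() + " -> " + rdd.preferredLocations(partition));
        }
        return lines;
    }
}
```

This runs entirely on the driver and launches no job. For an RDD with no locality information, such as a locally parallelized collection, the location list for each partition is expected to be empty.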