Abstract

Traditional attribute reduction is less effective when applied to large-scale datasets because of its high time and space complexity. In this paper, random sampling is introduced into traditional rough reduction. First, a statistical discernibility degree and a statistical rough reduction are proposed, both based on statistical rough approximation. The statistical rough reduction is no longer a reduction in the traditional sense; it is an attribute subset that keeps the statistical discernibility degree almost invariant. Random sampling is used to estimate the statistical discernibility degree, and all the condition attributes are sorted according to this estimate. The reduction is then carried out on the sorted attributes while keeping the statistical discernibility degree almost invariant. Finally, numerical experimental comparisons demonstrate that the random-sampling-based rough reduction is effective in terms of both time and space consumption.
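
The estimate, sort, and reduce steps summarized above can be illustrated with a minimal Python sketch. It does not reproduce the paper's definitions: here the discernibility degree of an attribute subset is assumed to be the fraction of sampled object pairs with different decision values that the subset tells apart. The names `estimate_discernibility` and `sampled_reduct` and the parameters `n_pairs` and `tol` are hypothetical.

```python
import random


def estimate_discernibility(rows, labels, attrs, pairs):
    """Assumed stand-in for the statistical discernibility degree:
    the fraction of sampled pairs with different decisions that
    differ on at least one attribute in `attrs`."""
    relevant = 0
    discerned = 0
    for i, j in pairs:
        if labels[i] == labels[j]:
            continue  # only pairs with different decisions need discerning
        relevant += 1
        if any(rows[i][a] != rows[j][a] for a in attrs):
            discerned += 1
    return discerned / relevant if relevant else 1.0


def sampled_reduct(rows, labels, n_pairs=2000, tol=0.01, seed=0):
    """Greedy reduction over attributes ranked by their sampled estimate."""
    rng = random.Random(seed)
    n = len(rows)
    all_attrs = list(range(len(rows[0])))
    # Sample object pairs once instead of enumerating all n*(n-1)/2 pairs.
    pairs = [(rng.randrange(n), rng.randrange(n)) for _ in range(n_pairs)]
    target = estimate_discernibility(rows, labels, all_attrs, pairs)
    # Sort condition attributes by their individual estimate, best first.
    ranked = sorted(
        all_attrs,
        key=lambda a: estimate_discernibility(rows, labels, [a], pairs),
        reverse=True,
    )
    chosen = []
    for a in ranked:
        chosen.append(a)
        # Stop once the estimate is "almost invariant" (within tol of target).
        if estimate_discernibility(rows, labels, chosen, pairs) >= target - tol:
            break
    return chosen


# Toy data: attribute 2 alone determines the decision value.
rows = [[0, i % 3, i % 2] for i in range(60)]
labels = [i % 2 for i in range(60)]
reduct = sampled_reduct(rows, labels)
```

In this toy case the sketch keeps only attribute 2, since that attribute already discerns every sampled pair with different decisions. The cost depends on the number of sampled pairs rather than on the square of the dataset size, which is the kind of saving the abstract attributes to sampling.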

Full text