Hi @sasidharp. There could be many solutions to your problem, so a little more context would be useful.
For example, does your HDFS dataset contain a column, or a group of columns, that could be used as an index? You might have a timestamp column whose values are unique across all rows; in that case you could use that column as a unique index.
Or maybe you have duplicate timestamps, but you also have a category column, and no two rows share both the same timestamp and the same category: then you could combine both columns to create a unique index.
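Here is a minimal sketch of how you could check whether a column, or a combination of columns, is unique enough to act as an index. It is plain Python over a few made-up rows, not DSS-specific code: the column names `timestamp`, `category` and `value`, the sample data, and the helper `duplicate_keys` are all assumptions for illustration only.

```python
from collections import Counter

def duplicate_keys(rows, columns):
    """Return the key tuples (built from `columns`) that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical sample rows; in practice these would come from your dataset.
rows = [
    {"timestamp": "2021-01-01 00:00", "category": "a", "value": 1},
    {"timestamp": "2021-01-01 00:00", "category": "b", "value": 2},
    {"timestamp": "2021-01-01 00:01", "category": "a", "value": 3},
]

# The timestamp alone repeats, so it cannot be a unique index by itself.
timestamp_dupes = duplicate_keys(rows, ["timestamp"])

# Timestamp plus category has no repeats, so the pair can serve as the index.
pair_dupes = duplicate_keys(rows, ["timestamp", "category"])

print(timestamp_dupes)
print(pair_dupes)
```

An empty list means the chosen columns identify each row uniquely. If you work with pandas, `df.duplicated(subset=[...]).any()` gives you the same kind of check.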
It is quite possible that there is a DSS function or feature that does exactly what you want and that I don't know about, but this is how I would try to solve the problem you describe.
I hope this helps! If there is a better solution, I expect others will post more information.