247

I have been working with a string[] array in C# that gets returned from a function call. I could possibly cast to a Generic collection, but I was wondering if there was a better way to do it, possibly by using a temp array.

What is the best way to remove duplicates from a C# array?

Stevoisiak
27.7k32 gold badges138 silver badges245 bronze badges
asked Aug 13, 2008 at 11:48
4
  • 5
    Use the Distinct extension method. Commented Aug 13, 2008 at 12:02
  • 1
    Indeed. It's more fun when the array is already sorted - in that case it can be done in-place in O(n) time. Commented Mar 2, 2012 at 20:54
  • @Vitim.us Nope. In my case, it isn't even an array, but a List<string>. I accept any answer that does the job. Perhaps, it's a shock of having to do it on paper. Commented Nov 23, 2012 at 0:05
  • A better way than...? And what's the idea of casting to a generic collection? Either way, to anybody feeling the urge to add yet another answer: keep in mind that the question is not "a way to remove duplicates" as almost everybody did. Any answer should account for time complexity and show benchmarks. So far, only two answers did a serious attempt. Commented Aug 13, 2021 at 12:48

29 Answers 29

498

You could possibly use a LINQ query to do this:

int[] s = { 1, 2, 3, 3, 4};
int[] q = s.Distinct().ToArray();
answered Aug 13, 2008 at 12:03
Sign up to request clarification or add additional context in comments.

4 Comments

Note that you can use an IEqualityComparer as a parameter, such as .Distinct(StringComparer.OrdinalIgnoreCase) to get a case-insensitive distinct set of strings.
Is Distinct honors original order of elements?
@asyrov: from MSDN : The Distinct() method returns an unordered sequence that contains no duplicate values.
What makes this "the best way"?
58

Here is the HashSet<string> approach:

public static string[] RemoveDuplicates(string[] s)
{
 HashSet<string> set = new HashSet<string>(s);
 string[] result = new string[set.Count];
 set.CopyTo(result);
 return result;
}

Unfortunately this solution also requires .NET framework 3.5 or later as HashSet was not added until that version. You could also use array.Distinct(), which is a feature of LINQ.

Fr33dan
4,3673 gold badges38 silver badges65 bronze badges
answered Aug 13, 2008 at 13:50

1 Comment

This probably will not preserve the original order.
11

The following tested and working code will remove duplicates from an array. You must include the System.Collections namespace.

string[] sArray = {"a", "b", "b", "c", "c", "d", "e", "f", "f"};
var sList = new ArrayList();
for (int i = 0; i < sArray.Length; i++) {
 if (sList.Contains(sArray[i]) == false) {
 sList.Add(sArray[i]);
 }
}
var sNew = sList.ToArray();
for (int i = 0; i < sNew.Length; i++) {
 Console.Write(sNew[i]);
}

You could wrap this up into a function if you wanted to.

answered Aug 13, 2008 at 13:04

1 Comment

This appears to be O(N^2)... You could use a heap instead of an ArrayList
10

If you needed to sort it, then you could implement a sort that also removes duplicates.

Kills two birds with one stone, then.

answered Aug 13, 2008 at 11:51

4 Comments

How does sorting remove duplicates?
Who voted this up? This isn't an answer. "How do I make pancakes?" "Put some ingredients in a bow and mix."
Correct, it is indeed not an answer. It was a comment, made before StackOverflow had comments, I believe. This question was asked when there were fewer than 10k questions on SO.
A helpful comment because I wanted to both sort and remove duplicates from my collection, and this comment gave me the idea to switch from a List<T> to a SortedSet<T> which solved my problem very succinctly.
8

This might depend on how much you want to engineer the solution - if the array is never going to be that big and you don't care about sorting the list you might want to try something similar to the following:

 public string[] RemoveDuplicates(string[] myList) {
 System.Collections.ArrayList newList = new System.Collections.ArrayList();
 foreach (string str in myList)
 if (!newList.Contains(str))
 newList.Add(str);
 return (string[])newList.ToArray(typeof(string));
 }
answered Aug 13, 2008 at 13:08

1 Comment

You should use List instead of ArrayList.
7
List<String> myStringList = new List<string>();
foreach (string s in myStringArray)
{
 if (!myStringList.Contains(s))
 {
 myStringList.Add(s);
 }
}

This is O(n^2), which won't matter for a short list which is going to be stuffed into a combo, but could be rapidly be a problem on a big collection.

Chandan Kumar
4,6228 gold badges45 silver badges63 bronze badges
answered Aug 13, 2008 at 14:19

Comments

6

-- This is Interview Question asked every time. Now i done its coding.

static void Main(string[] args)
{ 
 int[] array = new int[] { 4, 8, 4, 1, 1, 4, 8 }; 
 int numDups = 0, prevIndex = 0;
 for (int i = 0; i < array.Length; i++)
 {
 bool foundDup = false;
 for (int j = 0; j < i; j++)
 {
 if (array[i] == array[j])
 {
 foundDup = true;
 numDups++; // Increment means Count for Duplicate found in array.
 break;
 } 
 }
 if (foundDup == false)
 {
 array[prevIndex] = array[i];
 prevIndex++;
 }
 }
 // Just Duplicate records replce by zero.
 for (int k = 1; k <= numDups; k++)
 { 
 array[array.Length - k] = '0円'; 
 }
 Console.WriteLine("Console program for Remove duplicates from array.");
 Console.Read();
 }

2 Comments

You shouldn't do a O(n*2) time complexity for an this question.
You should use merge sort
6

Here is a O(n*n) approach that uses O(1) space.

void removeDuplicates(char* strIn)
{
 int numDups = 0, prevIndex = 0;
 if(NULL != strIn && *strIn != '0円')
 {
 int len = strlen(strIn);
 for(int i = 0; i < len; i++)
 {
 bool foundDup = false;
 for(int j = 0; j < i; j++)
 {
 if(strIn[j] == strIn[i])
 {
 foundDup = true;
 numDups++;
 break;
 }
 }
 if(foundDup == false)
 {
 strIn[prevIndex] = strIn[i];
 prevIndex++;
 }
 }
 strIn[len-numDups] = '0円';
 }
}

The hash/linq approaches above are what you would generally use in real life. However in interviews they usually want to put some constraints e.g. constant space which rules out hash or no internal api - which rules out using LINQ.

Chandan Kumar
4,6228 gold badges45 silver badges63 bronze badges
answered Dec 6, 2009 at 17:50

3 Comments

How can it ever use O(1) space, when you have to store the entire list? By starting with an inplace sort, you can do O(nlogn) time and O(n) memory, with much less code.
What makes you think it is storing the entire list? It is indeed doing in-place. And though not a condition in the question, my code maintains the order of characters in the original string. Sorting will remove that.
The inner loop (strIn[j] == strIn[i]) will be comparing a string to itself unless accounted for with an if statement.
5
protected void Page_Load(object sender, EventArgs e)
{
 string a = "a;b;c;d;e;v";
 string[] b = a.Split(';');
 string[] c = b.Distinct().ToArray();
 if (b.Length != c.Length)
 {
 for (int i = 0; i < b.Length; i++)
 {
 try
 {
 if (b[i].ToString() != c[i].ToString())
 {
 Response.Write("Found duplicate " + b[i].ToString());
 return;
 }
 }
 catch (Exception ex)
 {
 Response.Write("Found duplicate " + b[i].ToString());
 return;
 }
 } 
 }
 else
 {
 Response.Write("No duplicate ");
 }
}
Aliaksei Kliuchnikau
13.7k4 gold badges63 silver badges74 bronze badges
answered Feb 14, 2012 at 6:07

Comments

4

Add all the strings to a dictionary and get the Keys property afterwards. This will produce each unique string, but not necessarily in the same order your original input had them in.

If you require the end result to have the same order as the original input, when you consider the first occurance of each string, use the following algorithm instead:

  1. Have a list (final output) and a dictionary (to check for duplicates)
  2. For each string in the input, check if it exists in the dictionary already
  3. If not, add it both to the dictionary and to the list

At the end, the list contains the first occurance of each unique string.

Make sure you consider things like culture and such when constructing your dictionary, to make sure you handle duplicates with accented letters correctly.

answered Aug 13, 2008 at 12:53

Comments

4

The following piece of code attempts to remove duplicates from an ArrayList though this is not an optimal solution. I was asked this question during an interview to remove duplicates through recursion, and without using a second/temp arraylist:

private void RemoveDuplicate() 
{
ArrayList dataArray = new ArrayList(5);
 dataArray.Add("1");
 dataArray.Add("1");
 dataArray.Add("6");
 dataArray.Add("6");
 dataArray.Add("6");
 dataArray.Add("3");
 dataArray.Add("6");
 dataArray.Add("4");
 dataArray.Add("5");
 dataArray.Add("4");
 dataArray.Add("1");
 dataArray.Sort();
 GetDistinctArrayList(dataArray, 0);
}
private void GetDistinctArrayList(ArrayList arr, int idx)
{
 int count = 0;
 if (idx >= arr.Count) return;
 string val = arr[idx].ToString();
 foreach (String s in arr)
 {
 if (s.Equals(arr[idx]))
 {
 count++;
 }
 }
 if (count > 1)
 {
 arr.Remove(val);
 GetDistinctArrayList(arr, idx);
 }
 else
 {
 idx += 1;
 GetDistinctArrayList(arr, idx);
 }
 }
IAbstract
19.9k17 gold badges103 silver badges152 bronze badges
answered Apr 15, 2010 at 9:31

Comments

4

Simple solution:

using System.Linq;
...
public static int[] Distinct(int[] handles)
{
 return handles.ToList().Distinct().ToArray();
}
Sede
61.5k20 gold badges158 silver badges162 bronze badges
answered Mar 12, 2015 at 21:21

Comments

4

Maybe hashset which do not store duplicate elements and silently ignore requests to add duplicates.

static void Main()
{
 string textWithDuplicates = "aaabbcccggg"; 
 Console.WriteLine(textWithDuplicates.Count()); 
 var letters = new HashSet<char>(textWithDuplicates);
 Console.WriteLine(letters.Count());
 foreach (char c in letters) Console.Write(c);
 Console.WriteLine("");
 int[] array = new int[] { 12, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2 };
 Console.WriteLine(array.Count());
 var distinctArray = new HashSet<int>(array);
 Console.WriteLine(distinctArray.Count());
 foreach (int i in distinctArray) Console.Write(i + ",");
}
Chandan Kumar
4,6228 gold badges45 silver badges63 bronze badges
answered Dec 21, 2012 at 20:33

Comments

3

NOTE : NOT tested!

string[] test(string[] myStringArray)
{
 List<String> myStringList = new List<string>();
 foreach (string s in myStringArray)
 {
 if (!myStringList.Contains(s))
 {
 myStringList.Add(s);
 }
 }
 return myStringList.ToString();
}

Might do what you need...

EDIT Argh!!! beaten to it by rob by under a minute!

answered Aug 13, 2008 at 13:09

1 Comment

Rob didn't beat you to anything. He's using ArrayList, while you're using List. Your version is better.
3

Tested the below & it works. What's cool is that it does a culture sensitive search too

class RemoveDuplicatesInString
{
 public static String RemoveDups(String origString)
 {
 String outString = null;
 int readIndex = 0;
 CompareInfo ci = CultureInfo.CurrentCulture.CompareInfo;
 if(String.IsNullOrEmpty(origString))
 {
 return outString;
 }
 foreach (var ch in origString)
 {
 if (readIndex == 0)
 {
 outString = String.Concat(ch);
 readIndex++;
 continue;
 }
 if (ci.IndexOf(origString, ch.ToString().ToLower(), 0, readIndex) == -1)
 {
 //Unique char as this char wasn't found earlier.
 outString = String.Concat(outString, ch); 
 }
 readIndex++;
 }
 return outString;
 }
 static void Main(string[] args)
 {
 String inputString = "aAbcefc";
 String outputString;
 outputString = RemoveDups(inputString);
 Console.WriteLine(outputString);
 }

}

--AptSenSDET

answered Jun 15, 2013 at 7:18

Comments

3

This code 100% remove duplicate values from an array[as I used a[i]].....You can convert it in any OO language..... :)

for(int i=0;i<size;i++)
{
 for(int j=i+1;j<size;j++)
 {
 if(a[i] == a[j])
 {
 for(int k=j;k<size;k++)
 {
 a[k]=a[k+1];
 }
 j--;
 size--;
 }
 }
}
answered Jul 5, 2014 at 14:14

Comments

1

you can using This code when work with an ArrayList

ArrayList arrayList;
//Add some Members :)
arrayList.Add("ali");
arrayList.Add("hadi");
arrayList.Add("ali");
//Remove duplicates from array
 for (int i = 0; i < arrayList.Count; i++)
 {
 for (int j = i + 1; j < arrayList.Count ; j++)
 if (arrayList[i].ToString() == arrayList[j].ToString())
 arrayList.Remove(arrayList[j]);
answered Apr 12, 2015 at 9:12

Comments

1

Below is an simple logic in java you traverse elements of array twice and if you see any same element you assign zero to it plus you don't touch the index of element you are comparing.

import java.util.*;
class removeDuplicate{
int [] y ;
public removeDuplicate(int[] array){
 y=array;
 for(int b=0;b<y.length;b++){
 int temp = y[b];
 for(int v=0;v<y.length;v++){
 if( b!=v && temp==y[v]){
 y[v]=0;
 }
 }
 }
}
Liam
30k28 gold badges142 silver badges204 bronze badges
answered Oct 5, 2016 at 8:14

Comments

1
public static int RemoveDuplicates(ref int[] array)
{
 int size = array.Length;
 // if 0 or 1, return 0 or 1:
 if (size < 2) {
 return size;
 }
 int current = 0;
 for (int candidate = 1; candidate < size; ++candidate) {
 if (array[current] != array[candidate]) {
 array[++current] = array[candidate];
 }
 }
 // index to count conversion:
 return ++current;
}
answered Feb 2, 2016 at 5:21

Comments

1

Generic Extension method :

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
{
 if (source == null)
 throw new ArgumentNullException(nameof(source));
 HashSet<TSource> set = new HashSet<TSource>(comparer);
 foreach (TSource item in source)
 {
 if (set.Add(item))
 {
 yield return item;
 }
 }
}
answered Oct 13, 2018 at 4:32

Comments

1

The best way? Hard to say, the HashSet approach looks fast, but (depending on the data) using a sort algorithm (CountSort ?) can be much faster.

using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
 static void Main()
 {
 Random r = new Random(0); int[] a, b = new int[1000000];
 for (int i = b.Length - 1; i >= 0; i--) b[i] = r.Next(b.Length);
 a = new int[b.Length]; Array.Copy(b, a, b.Length);
 a = dedup0(a); Console.WriteLine(a.Length);
 a = new int[b.Length]; Array.Copy(b, a, b.Length);
 var w = System.Diagnostics.Stopwatch.StartNew();
 a = dedup0(a); Console.WriteLine(w.Elapsed); Console.Read();
 }
 static int[] dedup0(int[] a) // 48 ms 
 {
 return new HashSet<int>(a).ToArray();
 }
 static int[] dedup1(int[] a) // 68 ms
 {
 Array.Sort(a); int i = 0, j = 1, k = a.Length; if (k < 2) return a;
 while (j < k) if (a[i] == a[j]) j++; else a[++i] = a[j++];
 Array.Resize(ref a, i + 1); return a;
 }
 static int[] dedup2(int[] a) // 8 ms
 {
 var b = new byte[a.Length]; int c = 0;
 for (int i = 0; i < a.Length; i++) 
 if (b[a[i]] == 0) { b[a[i]] = 1; c++; }
 a = new int[c];
 for (int j = 0, i = 0; i < b.Length; i++) if (b[i] > 0) a[j++] = i;
 return a;
 }
}

Almost branch free. How? Debug mode, Step Into (F11) with a small array: {1,3,1,1,0}

 static int[] dedupf(int[] a) // 4 ms
 {
 if (a.Length < 2) return a;
 var b = new byte[a.Length]; int c = 0, bi, ai, i, j;
 for (i = 0; i < a.Length; i++)
 { ai = a[i]; bi = 1 ^ b[ai]; b[ai] |= (byte)bi; c += bi; }
 a = new int[c]; i = 0; while (b[i] == 0) i++; a[0] = i++;
 for (j = 0; i < b.Length; i++) a[j += bi = b[i]] += bi * i; return a;
 }

A solution with two nested loops might take some time, especially for larger arrays.

 static int[] dedup(int[] a)
 {
 int i, j, k = a.Length - 1;
 for (i = 0; i < k; i++)
 for (j = i + 1; j <= k; j++) if (a[i] == a[j]) a[j--] = a[k--];
 Array.Resize(ref a, k + 1); return a;
 }
answered May 24, 2020 at 8:57

Comments

0
 private static string[] distinct(string[] inputArray)
 {
 bool alreadyExists;
 string[] outputArray = new string[] {};
 for (int i = 0; i < inputArray.Length; i++)
 {
 alreadyExists = false;
 for (int j = 0; j < outputArray.Length; j++)
 {
 if (inputArray[i] == outputArray[j])
 alreadyExists = true;
 }
 if (alreadyExists==false)
 {
 Array.Resize<string>(ref outputArray, outputArray.Length + 1);
 outputArray[outputArray.Length-1] = inputArray[i];
 }
 }
 return outputArray;
 }
answered Nov 30, 2017 at 8:03

1 Comment

explain your answer, please.
0
int size = a.Length;
 for (int i = 0; i < size; i++)
 {
 for (int j = i + 1; j < size; j++)
 {
 if (a[i] == a[j])
 {
 for (int k = j; k < size; k++)
 {
 if (k != size - 1)
 {
 int temp = a[k];
 a[k] = a[k + 1];
 a[k + 1] = temp;
 }
 }
 j--;
 size--;
 }
 }
 }
answered Jun 6, 2020 at 23:58

3 Comments

Welcome to SO. While this code snippet may be the solution, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.
Regretfully this code does not remove anything, so it does not remove duplicates.
Regretfully the coder does't remove anything either :)
0

Removing duplicate and ignore case sensitive using Distinct & StringComparer.InvariantCultureIgnoreCase

string[] array = new string[] { "A", "a", "b", "B", "a", "C", "c", "C", "A", "1" };
var r = array.Distinct(StringComparer.InvariantCultureIgnoreCase).ToList();
Console.WriteLine(r.Count); // return 4 items
answered Jul 30, 2020 at 4:47

2 Comments

The question is "What is the best way to remove duplicates from a C# array?". You don't answer that question.
well read again the question "How do I remove duplicates from a C# array?"
0

Find answer below.

class Program
{
 static void Main(string[] args)
 {
 var nums = new int[] { 1, 4, 3, 3, 3, 5, 5, 7, 7, 7, 7, 9, 9, 9 };
 var result = removeDuplicates(nums);
 foreach (var item in result)
 {
 Console.WriteLine(item);
 }
 }
 static int[] removeDuplicates(int[] nums)
 {
 nums = nums.ToList().OrderBy(c => c).ToArray();
 int j = 1;
 int i = 0;
 int stop = 0;
 while (j < nums.Length)
 {
 if (nums[i] != nums[j])
 {
 nums[i + 1] = nums[j];
 stop = i + 2;
 i++;
 }
 j++;
 }
 nums = nums.Take(stop).ToArray();
 return nums;
 }
}

Just a bit of contribution based on a test i just solved, maybe helpful and open to improvement by other top contributors here. Here are the things i did:

  1. I used OrderBy which allows me order or sort the items from smallest to the highest using LINQ
  2. I then convert it to back to an array and then re-assign it back to the primary datasource
  3. So i then initialize j which is my right hand side of the array to be 1 and i which is my left hand side of the array to be 0, i also initialize where i would i to stop to be 0.
  4. I used a while loop to increment through the array by going from one position to the other left to right, for each increment the stop position is the current value of i + 2 which i will use later to truncate the duplicates from the array.
  5. I then increment by moving from left to right from the if statement and from right to right outside of the if statement until i iterate through the entire values of the array.
  6. I then pick from the first element to the stop position which becomes the last i index plus 2. that way i am able to remove all the duplicate items from the int array. which is then reassigned.
answered Oct 25, 2020 at 0:10

Comments

0

So I was doing an interview session and got the same question to sort and distinct

static void Sort()
 {
 try
 {
 int[] number = new int[Convert.ToInt32(Console.ReadLine())];
 for (int i = 0; i < number.Length; i++)
 {
 number[i] = Convert.ToInt32(Console.ReadLine());
 }
 Array.Sort(number);
 int[] num = number.Distinct().ToArray();
 for (int i = 0; i < num.Length; i++)
 {
 Console.WriteLine(num[i]);
 }
 }
 catch (Exception ex)
 {
 Console.WriteLine(ex);
 }
 Console.Read();
 }
answered Aug 17, 2021 at 11:15

Comments

0

In my case, with array of objects it worked like this. Choosing which attribute you want to use as a filter;


var objectArrayDuplicate = new List<Usuario>();
objectArrayDuplicate.Add(new Usuario { Nome = "Jose", Id = 5 });
objectArrayDuplicate.Add(new Usuario { Nome = "Jose", Id = 5 });
objectArrayDuplicate.Add(new Usuario { Nome = "Maria", Id = 3 });
objectArrayDuplicate.Add(new Usuario { Nome = "Josh", Id = 6 });
 List<Usuario> objectArray = 
objectArrayDuplicate.GroupBy(o => o.Nome).Select(o => o.First()).ToList();
abolfazl sadeghi
2,4182 gold badges15 silver badges23 bronze badges
answered May 5, 2023 at 18:35

Comments

-1
using System;
using System.Collections.Generic;
using System.Linq;
namespace Rextester
{
 public class Program
 {
 public static void Main(string[] args)
 {
 List<int> listofint1 = new List<int> { 4, 8, 4, 1, 1, 4, 8 };
 List<int> updatedlist= removeduplicate(listofint1);
 foreach(int num in updatedlist)
 Console.WriteLine(num);
 }
 public static List<int> removeduplicate(List<int> listofint)
 {
 List<int> listofintwithoutduplicate= new List<int>();
 foreach(var num in listofint)
 {
 if(!listofintwithoutduplicate.Any(p=>p==num))
 {
 listofintwithoutduplicate.Add(num);
 }
 }
 return listofintwithoutduplicate;
 }
 }
}
answered Apr 25, 2019 at 17:28

1 Comment

This is a very inefficient way of doing this. Have a look at the other answers to see what they do.
-1
strINvalues = "1,1,2,2,3,3,4,4";
strINvalues = string.Join(",", strINvalues .Split(',').Distinct().ToArray());
Debug.Writeline(strINvalues);

Kkk Not sure if this is witchcraft or just beautiful code

1 strINvalues .Split(',').Distinct().ToArray()

2 string.Join(",", XXX);

1 Splitting the array and using Distinct [LINQ] to remove duplicates 2 Joining it back without the duplicates.

Sorry I never read the text on StackOverFlow just the code. it make more sense than the text ;)

answered Oct 3, 2019 at 16:23

2 Comments

Code-only answers are low-quality answers. Add some explanation to why this works.
The question is "What is the best way to remove duplicates from a C# array?". You don't answer that question.

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.