Re: SELECT DISTINCT records based only on two columns



That would be more troublesome, since there would be two or more UIDs for a duplicated set of fields. Which one to pick?

You could use a GROUP BY, and decide that you want the max( UID ) or min( UID ). Something like this:

SELECT
max( UID )
, FullName = ( FName + ' ' + LName )
, Address1
, Address2 = ( isnull( Address2, '' ))
, CityState = ( City + ', ' + State + ' ' + ZipCode )
FROM #MY_CONTACTS_TBL
GROUP BY
FNAME
, LNAME
, Address1
, Address2
, City
, State
, ZipCode

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


"archuleta37" <archuleta37@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message news:CEAE9B75-4BB7-428C-8B1C-2A012F145DBC@xxxxxxxxxxxxxxxx
Arnie,

You make some good points about the table design. Also, thanks for shedding
a bit more light on how the DISTINCT keyword works. Based on the query you
wrote, I see my mistake was to include the UID field in my query. I don't
think I will actually need that field so that should do the the trick. But
out of curiosity, I wonder how the query could be written if I did need the
UID field. Would I use multiple select statements and the IN keyword or
somthing like that?





"Arnie Rowland" wrote:

First, a suggestion.

Make the ZipCode field a varchar() -either 9 or 10, depending upon what you do with the dash. If you use an int, leading zeros (upper Northeast States) is lost, and 9 digit zipcodes (e.g., 98765-4321) will do the math and store the result.

Similar leading zero problem with SSN

The use of DISTINCT serves two purposes. First, it makes sure that there are no duplicates by testing all of the selected columns as a unit -ignoring capitalization differences (with the standard SQL_Latin1_General_CP1_CI_AS collation). Second, it provides a 'sorted' output since it has to accomplish an ORDER BY just to remove duplicates. However, if there is any character, including space, period, etc., difference, the comparison fails.

CREATE TABLE dbo.#MY_CONTACTS_TBL
( UID smallint IDENTITY(1,1) NOT NULL
, SSN char(9) NULL
, FName varchar(50)
, LName varchar(50)
, Address1 varchar(50)
, Address2 varchar(50)
, City varchar(50)
, State varchar(50)
, Zipcode varchar(10) NULL
)
GO

INSERT INTO #MY_CONTACTS_TBL VALUES ( '123456789','Marc','Archuleta','123 Sesame St','apartment 1','Columbus','OH','43215' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Dough St',NULL,'Columbus','OH','43215' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '345678901','John2','Doe','456 Dough St',NULL,'Columbus','OH','43215' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '456789012','John','DoeTwo','789 Dough St',NULL,'Columbus','OH','43215' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','john','doe','123 dough St',NULL,'columbus','oh','43215' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Pumpernickel Ave',NULL,'New York','NY','10016' )
INSERT INTO #MY_CONTACTS_TBL VALUES ( '234567890','John','Doe','123 Dough St.',NULL,'Columbus','OH','43215' )

SELECT DISTINCT
FName + ' ' + LName
, Address1
, isnull( Address2, '' )
, City + ', ' + State + ' ' + ZipCode
FROM #MY_CONTACTS_TBL

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


"archuleta37" <archuleta37@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message news:269374D2-A26E-42B4-8CA1-0BF057600B39@xxxxxxxxxxxxxxxx
I'm working on a mailing list and want to select records where the both the
SSN number and Address fields are unique, since I may want to send a piece of
mail to more than one address for a person, but not multiple pieces to the
same address for that person. I've created a test table and some test data to
try to get my syntax right (below) but so far have had no luck.

I'm not certain if I can do this using DISTINCT or the UNIQUE keyword, nor
how to get my syntax right. Could someone help me with this?

Thanks in advance :-)
Marc


************* CREATE TABLE *****************************
USE [Northwind]
GO
/****** Object: Table [dbo].[MY_CONTACTS_TBL] Script Date: 08/17/2006
09:07:19 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[MY_CONTACTS_TBL](
[UID] [smallint] IDENTITY(1,1) NOT NULL,
[SSN] [bigint] NULL,
[FNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[LNAME] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ADDRESS1] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ADDRESS2] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CITY] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[STATE] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ZIP] [int] NULL,
CONSTRAINT [PK_MY_CONTACTS_TBL] PRIMARY KEY CLUSTERED
(
[UID] ASC
)WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF

******************* TABLE DATA *****************************
UID,SSN,FNAME,LNAME,ADDRESS1,ADDRESS2,CITY,STATE,ZIP
1,123456789,Marc,Archuleta,123 Sesame St,apartment 1,Columbus,OH,43215
2,234567890,John,Doe,123 Dough St,,Columbus,OH,43215
3,345678901,John2,Doe,456 Dough St,,Columbus,OH,43215
4,456789012,John,DoeTwo,789 Dough St,,Columbus,OH,43215
5,234567890,John,Doe,123 Dough St,,Columbus,OH,43215
6,234567890,John,Doe,123 Pumpernickel Ave,,New York,NY,10016