Sunday, March 25, 2012

COUNT function not working with GROUP BY

The following SQL statement returns 38 records:
SELECT Base_Contacts.ID_Contact AS CONTACTS
FROM Base_Contacts LEFT OUTER JOIN
Base_ContactAttributes ON Base_Contacts.ID_Contact =
Base_ContactAttributes.ID_Contact
WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
GROUP BY Base_Contacts.ID_Contact
HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
Rather than retrieving the recordset, I want to simply get a count. My
syntax must be wrong (same syntax as above but I added COUNT) because the
following query simply returns 38 records with the value "1":
SELECT COUNT(DISTINCT Base_Contacts.ID_Contact) AS CONTACTS
FROM Base_Contacts LEFT OUTER JOIN
Base_ContactAttributes ON Base_Contacts.ID_Contact =
Base_ContactAttributes.ID_Contact
WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
GROUP BY Base_Contacts.ID_Contact
HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
Any suggestions for this beginner? Thanks!
Hi
GROUP BY always returns 1 row for each different value of the GROUP BY
column, so you're still getting one row for each different
Base_Contacts.ID_Contact
value. If you want the count of all values, don't use GROUP BY.
I'm not exactly sure what you want to do with
Base_ContactAttributes.ID_Level_Value in the new query. You'll need to give
us more info, and maybe even some sample data and sample output, or at least
an explanation of what you're doing with that value if you want the rest of
the solution.
HTH
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"gstoa" <gstoa@.discussions.microsoft.com> wrote in message
news:FC26E396-3FFF-408E-B326-F5DA12FACE3B@.microsoft.com...
> The following SQL statement returns 38 records:
> SELECT Base_Contacts.ID_Contact AS CONTACTS
> FROM Base_Contacts LEFT OUTER JOIN
> Base_ContactAttributes ON Base_Contacts.ID_Contact =
> Base_ContactAttributes.ID_Contact
> WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
> GROUP BY Base_Contacts.ID_Contact
> HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
> Rather than retrieving the recordset, I want to simply get a count. My
> syntax must be wrong (same syntax as above but I added COUNT) because the
> following query simply returns 38 records with the value "1":
> SELECT COUNT(DISTINCT Base_Contacts.ID_Contact) AS CONTACTS
> FROM Base_Contacts LEFT OUTER JOIN
> Base_ContactAttributes ON Base_Contacts.ID_Contact =
> Base_ContactAttributes.ID_Contact
> WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
> GROUP BY Base_Contacts.ID_Contact
> HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
> Any suggestions for this beginner? Thanks!
|||Remove your GROUP BY clause. You're counting the number in every group of
ID_Contacts.
"gstoa" <gstoa@.discussions.microsoft.com> wrote in message
news:FC26E396-3FFF-408E-B326-F5DA12FACE3B@.microsoft.com...
> The following SQL statement returns 38 records:
> SELECT Base_Contacts.ID_Contact AS CONTACTS
> FROM Base_Contacts LEFT OUTER JOIN
> Base_ContactAttributes ON Base_Contacts.ID_Contact =
> Base_ContactAttributes.ID_Contact
> WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
> GROUP BY Base_Contacts.ID_Contact
> HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
> Rather than retrieving the recordset, I want to simply get a count. My
> syntax must be wrong (same syntax as above but I added COUNT) because the
> following query simply returns 38 records with the value "1":
> SELECT COUNT(DISTINCT Base_Contacts.ID_Contact) AS CONTACTS
> FROM Base_Contacts LEFT OUTER JOIN
> Base_ContactAttributes ON Base_Contacts.ID_Contact =
> Base_ContactAttributes.ID_Contact
> WHERE (Base_ContactAttributes.ID_Level_Value IN (10150, 10153))
> GROUP BY Base_Contacts.ID_Contact
> HAVING (COUNT(DISTINCT Base_ContactAttributes.ID_Level_Value) = 2)
> Any suggestions for this beginner? Thanks!
|||Here's some sample data:
CREATE TABLE mypeople
(
id_person int IDENTITY (1, 1) NOT NULL,
name varchar(10) NULL
)
CREATE TABLE myattributes
(
id_person int NOT NULL,
id_attribute int NOT NULL
) ON [PRIMARY]
INSERT INTO mypeople VALUES('John')
INSERT INTO mypeople VALUES('Bill')
INSERT INTO mypeople VALUES('Jane')
INSERT INTO myattributes VALUES(1,1000)
INSERT INTO myattributes VALUES(1,1002)
INSERT INTO myattributes VALUES(1,1004)
INSERT INTO myattributes VALUES(1,1006)
INSERT INTO myattributes VALUES(1,1008)
INSERT INTO myattributes VALUES(2,1002)
INSERT INTO myattributes VALUES(2,1004)
INSERT INTO myattributes VALUES(3,1004)
Now my count distinct query is as follows:
SELECT COUNT(DISTINCT dbo.mypeople.name) AS Expr1
FROM dbo.mypeople INNER JOIN
dbo.myattributes ON dbo.mypeople.id_person =
dbo.myattributes.id_person
WHERE (dbo.myattributes.id_attribute IN (1002, 1004))
HAVING (COUNT(DISTINCT dbo.myattributes.id_attribute) = 2)
Notice that the count returned from this query is 3. Unless I'm not
understanding the query logic correctly, I would have thought that only two
people met the IN/HAVING clause criteria. In this example, only John and
Bill have attribute values of both 1002 and 1004. I would have thought that
this
query would have returned the value of 2 -- not 3 ?
"Kalen Delaney" wrote:

> Hi
> GROUP BY always returns 1 row for each different value of the GROUP BY
> column, so you're still getting one row for each different
> Base_Contacts.ID_Contact
> value. If you want the count of all values, don't use GROUP BY.
> I'm not exactly sure what you want to do with
> Base_ContactAttributes.ID_Level_Value in the new query. You'll need to give
> us more info, and maybe even some sample data and sample output, or at least
> an explanation of what you're doing with that value if you want the rest of
> the solution.
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "gstoa" <gstoa@.discussions.microsoft.com> wrote in message
> news:FC26E396-3FFF-408E-B326-F5DA12FACE3B@.microsoft.com...
>
>
|||Thanks for the nice script. In your case, the count in the SELECT list is
counting the number of people who have any rows returned with values of
either 1002 or 1004 and so that is all the people, 3 of them.
I'm not sure exactly what the HAVING is doing here, but I do not think it
means what you think it means. I'll have to think about it to figure out
what it means here, or I'll just ask Itzik. :-)
In the meantime, you can rewrite this. You really don't need the mypeople
table at all, since you don't need any info that is just in that table. This
query gives you the list of id_person values that have exactly the two
attributes you need:
SELECT dbo.myattributes.id_person, COUNT(*) AS Expr1
FROM dbo.myattributes
WHERE (dbo.myattributes.id_attribute IN (1002, 1004))
GROUP BY dbo.myattributes.id_person
HAVING COUNT(*) = 2
So, we can just make that a derived table, and count the rows in it:
SELECT count(*) FROM
(SELECT dbo.myattributes.id_person, COUNT(*) AS Expr1
FROM dbo.myattributes
WHERE (dbo.myattributes.id_attribute IN (1002, 1004))
GROUP BY dbo.myattributes.id_person
HAVING COUNT(*) = 2) AS counts
HTH
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"gstoa" <gstoa@.discussions.microsoft.com> wrote in message
news:DB1E33DD-5BE0-46A4-8511-668EC549DC7D@.microsoft.com...
> Here's some sample data:
> CREATE TABLE mypeople
> (
> id_person int IDENTITY (1, 1) NOT NULL,
> name varchar(10) NULL
> )
> CREATE TABLE myattributes
> (
> id_person int NOT NULL,
> id_attribute int NOT NULL
> ) ON [PRIMARY]
> INSERT INTO mypeople VALUES('John')
> INSERT INTO mypeople VALUES('Bill')
> INSERT INTO mypeople VALUES('Jane')
> INSERT INTO myattributes VALUES(1,1000)
> INSERT INTO myattributes VALUES(1,1002)
> INSERT INTO myattributes VALUES(1,1004)
> INSERT INTO myattributes VALUES(1,1006)
> INSERT INTO myattributes VALUES(1,1008)
> INSERT INTO myattributes VALUES(2,1002)
> INSERT INTO myattributes VALUES(2,1004)
> INSERT INTO myattributes VALUES(3,1004)
> Now my count distinct query is as follows:
> SELECT COUNT(DISTINCT dbo.mypeople.name) AS Expr1
> FROM dbo.mypeople INNER JOIN
> dbo.myattributes ON dbo.mypeople.id_person =
> dbo.myattributes.id_person
> WHERE (dbo.myattributes.id_attribute IN (1002, 1004))
> HAVING (COUNT(DISTINCT dbo.myattributes.id_attribute) = 2)
> Notice that the count returned from this query is 3. Unless I'm not
> understanding the query logic correctly, I would have thought that only
two
> people met the IN/HAVING clause criteria. In this example, only John and
> Bill have attribute values of both 1002 and 1004. I would have thought
that[vbcol=seagreen]
> this
> query would have returned the value of 2 -- not 3 ?
>
> "Kalen Delaney" wrote:
give[vbcol=seagreen]
least[vbcol=seagreen]
of[vbcol=seagreen]
My[vbcol=seagreen]
the[vbcol=seagreen]
|||gstoa,
you are looking for relational division. You may want to look up this
topic on the internet. (or you can just use Kalen's suggested query).
What your current query does is first calculate how many people have
attribute 1002 *or* attribute 1004. At the same time it will calculate
the number of different attributes that could be found and matched
either 1002 or 1004.
The first calculation results in 3, the second in 2. Up to this point,
the resultset is one row with these two values.
Then, all rows that do not have an attribute-count of 2 are removed. In
this case no rows are removed.
Finally, the requested columns of the resultset are returned. This is
the first calculated value of 3.
Gert-Jan
gstoa wrote:
> Here's some sample data:
<snip>
> SELECT COUNT(DISTINCT dbo.mypeople.name) AS Expr1
> FROM dbo.mypeople INNER JOIN
> dbo.myattributes ON dbo.mypeople.id_person =
> dbo.myattributes.id_person
> WHERE (dbo.myattributes.id_attribute IN (1002, 1004))
> HAVING (COUNT(DISTINCT dbo.myattributes.id_attribute) = 2)
<snip>
(Please reply only to the newsgroup)

No comments:

Post a Comment