Sunday, March 25, 2012

Count consecutive numbers

I'm trying extract a count of consecutive numbers, or "unbroken" years in
this case, at any particular given time.

For example (simplified):

CREATE TABLE #Customers
(
CustNo INT,
YearNo INT,
IsCust CHAR(1)
)

INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2006, 'Y')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2005, 'Y')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2004, 'Y')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2003, 'N')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2002, 'N')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2001, 'Y')
INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2000, 'Y')

SELECT * FROM #Customers

CustNo YearNo IsCust
---- ---- --
999 2006 Y
999 2005 Y
999 2004 Y
999 2003 N
999 2002 N
999 2001 Y
999 2000 Y

In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001 for
2, etc. Ideally I'd feed it a single year to lookup

I'm resisting the urge to create cursor here -- anyone have any hints?

...Chris.ChrisD wrote:

> In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001
> for 2, etc. Ideally I'd feed it a single year to lookup

This works in Postgres, you'll have to change the "limit 1" to mssql TOP 1
syntax. Also note the hardcoded year on line 4, replace that with a
parameter.

Ironically, this only works if you specify the year. Without the year you
get spurious rows.

select a.yearno,b.yearno,(a.yearno - b.yearno) + 1 as "years"
from customers a join customers b on a.custno = b.custno
where a.yearno > b.yearno
AND a.yearno = 2006
AND a.isCust = 'Y' and b.isCust = 'Y'
and not exists
(
select yearno
FROM customers x
WHERE x.custno = a.custno
AND x.yearno between b.yearno AND a.yearno
AND x.isCust = 'N'
)
order by b.yearno
limit 1

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@.(Sec)ure(Dat)a(.com)|||"ChrisD" <spambucket@.hotmail.com> wrote in message news:<Yfn2e.840846$Xk.593396@.pd7tw3no>...
> I'm trying extract a count of consecutive numbers, or "unbroken" years in
> this case, at any particular given time.
> For example (simplified):
> CREATE TABLE #Customers
> (
> CustNo INT,
> YearNo INT,
> IsCust CHAR(1)
> )
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2006, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2005, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2004, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2003, 'N')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2002, 'N')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2001, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2000, 'Y')
> SELECT * FROM #Customers
8<-- Obvious result omitted
> In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001 for
> 2, etc. Ideally I'd feed it a single year to lookup
> I'm resisting the urge to create cursor here -- anyone have any hints?
> ...Chris.

The computation you want to perform is a subtraction.
There are some caveats concernig your data.

Just two hints ...|||how about this:

select top 1 max(a.yearno)+1 as from_ ,b.yearno as to_ , b.yearno -(
max(a.yearno)+1) as consecutive_time from #customers a join #Customers
b on a.custno = b.custno and
a.iscust='N' and b.iscust='Y' and a.yearno < b.yearno
group by b.yearno
order by consecutive_time desc

i.e. get the max diff between an 'N' and the 'Y' after it|||create view cust as
select custno, yearno, isCust from Customers
union
select custno, min(yearno) - 1, 'N'
from Customers group by custno
go

select custno,yearno, iscust,
case iscust
when 'N' THEN 0
ELSE 1+(select count(*)
from cust a where a.custno = b.custno and
a.yearno < b.yearno and
(a.yearno >
(select max(yearno) from cust c where iscust = 'N' and yearno <
b.yearno and custno = b.custno))
) end as active_for
from cust b
where yearno >= (select min(yearno) from customers x where x.custno =
b.custno )
order by custno, yearno|||"ChrisD" <spambucket@.hotmail.com> wrote in message news:Yfn2e.840846$Xk.593396@.pd7tw3no...
> I'm trying extract a count of consecutive numbers, or "unbroken" years in
> this case, at any particular given time.
> For example (simplified):
> CREATE TABLE #Customers
> (
> CustNo INT,
> YearNo INT,
> IsCust CHAR(1)
> )
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2006, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2005, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2004, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2003, 'N')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2002, 'N')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2001, 'Y')
> INSERT INTO #Customers (custno, yearno, isCust) VALUES (999, 2000, 'Y')
> SELECT * FROM #Customers
> CustNo YearNo IsCust
> ---- ---- --
> 999 2006 Y
> 999 2005 Y
> 999 2004 Y
> 999 2003 N
> 999 2002 N
> 999 2001 Y
> 999 2000 Y
> In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001 for
> 2, etc. Ideally I'd feed it a single year to lookup
> I'm resisting the urge to create cursor here -- anyone have any hints?
> ...Chris.

SELECT C.CustNo AS CustNo,
C.YearNo AS YearNo,
C.YearNo - MAX(FY.YearNo) + 1 AS YearTally
FROM #Customers AS C
INNER JOIN
(SELECT C1.CustNo, C1.YearNo
FROM #Customers AS C1
LEFT OUTER JOIN
#Customers AS C2
ON C1.CustNo = C2.CustNo AND
C2.YearNo = C1.YearNo - 1 AND
C2.IsCust = 'Y'
WHERE C1.IsCust = 'Y' AND C2.CustNo IS NULL) AS FY -- 1st year
ON FY.CustNo = C.CustNo AND
C.IsCust = 'Y' AND
FY.YearNo <= C.YearNo
GROUP BY C.CustNo, C.YearNo
ORDER BY CustNo, YearNo

--
JAG|||"ChrisD" <spambucket@.hotmail.com> wrote in message news:<Yfn2e.840846$Xk.593396@.pd7tw3no>...
> I'm trying extract a count of consecutive numbers, or "unbroken" years in
> this case, at any particular given time.
> For example (simplified):
> CREATE TABLE #Customers
> (
> CustNo INT,
> YearNo INT,
> IsCust CHAR(1)
> )
8<----Big snip
> In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001 for
> 2, etc. Ideally I'd feed it a single year to lookup
> I'm resisting the urge to create cursor here -- anyone have any hints?
> ...Chris.

As I said in my previous posting:
The computation you want to perform is a subtraction.
There are some caveats concernig your data.

Chris wanted hints, not complete solutions.
The solutions offered are likely to fail (I didn't test this)
if there is an 'active' year without any 'inactive' predecessor.|||"Theo Peterbroers" <peterbroers@.floron.leidenuniv.nl> wrote in message
news:39bb2c10.0503300659.231f1c7c@.posting.google.c om...
> "ChrisD" <spambucket@.hotmail.com> wrote in message news:<Yfn2e.840846$Xk.593396@.pd7tw3no>...
> > I'm trying extract a count of consecutive numbers, or "unbroken" years in
> > this case, at any particular given time.
> > For example (simplified):
> > CREATE TABLE #Customers
> > (
> > CustNo INT,
> > YearNo INT,
> > IsCust CHAR(1)
> > )
> 8<----Big snip
> > In 2006 CustNo 999 would have been active for 3 years, 2004 for 1, 2001 for
> > 2, etc. Ideally I'd feed it a single year to lookup
> > I'm resisting the urge to create cursor here -- anyone have any hints?
> > ...Chris.
> As I said in my previous posting:
> The computation you want to perform is a subtraction.
> There are some caveats concernig your data.
> Chris wanted hints, not complete solutions.

I didn't take that as his literal intention. If it was, a quick glance
will reveal a solution, but probably not lead to comprehension,
and he can choose to ignore it.

> The solutions offered are likely to fail (I didn't test this)
> if there is an 'active' year without any 'inactive' predecessor.

His sample data includes an active year without an inactive
predecessor. As Chris was helpful enough to include DDL
and sample data, I assume all respondents who offered
complete solutions availed themselves of it. As far as I
can tell, my solution solves the problem.

--
JAG|||Yet another version, with a little-used predicate!

SELECT X.cust_nbr, MIN(X.start_date) AS start_date, X.end_date
FROM (SELECT C1.cust_nbr, C1.cust_year, MAX(C2.cust_year)
FROM Customers AS C1, Customers AS C2
WHERE C1.cust_nbr = C2.cust_nbr
AND C1.cust_year <= C2.cust_year
AND 'Y' = ALL (SELECT cust_flag
FROM Customers AS C3
WHERE C3.cust_nbr = C2.cust_nbr
AND C3.cust_year BETWEEN C1.cust_year AND
C2.cust_year)
GROUP BY C1.cust_nbr, C1.cust_year)
AS X(cust_nbr, start_year, end_year)
GROUP BY X.cust_nbr, X.end_date;|||Opps! fix my typos:

SELECT X.cust_nbr, MIN(X.start_year) AS start_date, X.end_year
FROM (SELECT C1.cust_nbr, C1.cust_year, MAX(C2.cust_year)
FROM Customers AS C1, Customers AS C2
WHERE C1.cust_nbr = C2.cust_nbr
AND C1.cust_year <= C2.cust_year
AND 'Y' = ALL (SELECT cust_flag
FROM Customers AS C3
WHERE C3.cust_nbr = C2.cust_nbr
AND C3.cust_year BETWEEN C1.cust_year AND
C2.cust_year)
GROUP BY C1.cust_nbr, C1.cust_year)
AS X(cust_nbr, start_year, end_year)
GROUP BY X.cust_nbr, X.end_year;|||ChrisD wrote:

> In 2006 CustNo 999 would have been active for 3 years, 2004 for 1,
> 2001 for 2, etc. Ideally I'd feed it a single year to lookup

Thanks all for the nudges!

I was able to make this work using a combination of John's and Kenneth's
samples. Joe's works too.

In practice I will always have a previous year -- but I suppose it's a
always a good idea to check.

...Chris.

No comments:

Post a Comment