>> We have an order processing database which includes the standard
Order
Header/Order Lines tables. <<
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. My guess is that the order details looks like this:
CREATE TABLE OrderDetails
(order_nbr INTEGER NOT NULL
REFERENCES Orders(order_nbr)
ON UPDATE CASCADE
ON DELETE CASCADE,
upc CHAR(10) NOT NULL -- or other industry code
REFERENCES Inventory(upc)
ON UPDATE CASCADE
ON DELETE CASCADE,
qty INTEGER NOT NULL,
...);
I'm trying to write a query to get the average number of lines per
order over a given period, and I'm stuck :-( <<
I hope that you mean the average number of **items** per order over a
given period. The way you said this would imply that you are copying
the input form (where lines are PHYSICAL), and have no proper LOGICAL
model. If I order my eggs one at a time, I get 12 lines; If I map the
order into a LOGICAL row, I get one row with a quantity of 12 eggs
(and then I can look for packaging units, discounts, etc.)
SELECT AVG(X.tally) AS avg_per_order
FROM (SELECT COUNT(*)
FROM OrderDetails AS D1
WHERE D1.order_nbr
IN (SELECT O1.order_nbr
FROM Orders AS O1
WHERE O1.order_dt
BETWEEN @my_start_dateAND @my_end_date)
GROUP BY order_nbr) AS X(tally);
The trick for aggregates inside aggregates is to use a derived table
for the lower level aggregations, then nest them outward.