最近要去当兵,所以在进去前多少学一下资料库。
SELECT COUNT( * ) FROM table_name: 能够计算资料之总数
计算Software Engineer工程师的数量。
mysql> SELECT COUNT(*) FROM employee -> WHERE title="Software Engineer";+----------+| COUNT(*) |+----------+| 3 |+----------+1 row in set (0.01 sec)
SELECT DISTINCT column_name FROM table_name; 可以区分资料,将重複资料过滤并做分类。
将title做分类。
mysql> SELECT DISTINCT title FROM employee;+------------------------+| title |+------------------------+| Software Engineer || Software Architect || Database Administrator || Project Manager || Test Engineer |+------------------------+5 rows in set (0.01 sec)
搭配count,计算总共有几种title。
mysql> SELECT count(DISTINCT title) FROM employee;+-----------------------+| count(DISTINCT title) |+-----------------------+| 5 |+-----------------------+1 row in set (0.01 sec)
SELECT column_name FROM table_name GROUP BY column_name ; :跟DISTINCT类似,而不同在GROUP BY去除重複值后,将资料(多栏位)分类并组成一个群组。
以title,last_name去分类并组成群组,可发现全部都各为一个群组,共分为9组。
mysql> SELECT title, last_name,count(title) FROM employee -> GROUP BY title,last_name;+------------------------+-----------+--------------+| title | last_name | count(title) |+------------------------+-----------+--------------+| Database Administrator | Clifford | 1 || Database Administrator | Dickens | 1 || Project Manager | Clifford | 1 || Software Architect | Edward | 1 || Software Architect | Gilbert | 1 || Software Engineer | Clifford | 1 || Software Engineer | Jackman | 1 || Software Engineer | Newman | 1 || Test Engineer | Chan | 1 |+------------------------+-----------+--------------+9 rows in set (0.00 sec)
以title去分类并组成群组,共为5组,且有些组不只一笔资料。
mysql> SELECT title, last_name,count(title) FROM employee -> GROUP BY title;+------------------------+-----------+--------------+| title | last_name | count(title) |+------------------------+-----------+--------------+| Database Administrator | Dickens | 2 || Project Manager | Clifford | 1 || Software Architect | Edward | 2 || Software Engineer | Jackman | 3 || Test Engineer | Chan | 1 |+------------------------+-----------+--------------+5 rows in set (0.00 sec)
而如果只是需要过滤掉重複的资料,那么可以使用DISTINCT且比较不占用效能。
但如果非纯粹去重複,且不考虑效能问题,且GROUP BY也可以处理较複杂的逻辑,所以建议尽量使用GROUP BY
。
SELECT MAX(column_name) FROM table_name; 可以显示最高数值的资料。
而我们也可以搭配GROUP BY,在以title区分为不同群组后,对不同组的资料取其最高之薪水。
mysql> SELECT title,max(salary) FROM employee -> GROUP BY title;+------------------------+-------------+| title | max(salary) |+------------------------+-------------+| Database Administrator | 6800 || Project Manager | 8500 || Software Architect | 8000 || Software Engineer | 5500 || Test Engineer | 6500 |+------------------------+-------------+5 rows in set (0.00 sec)
SELECT MAX(column_name) FROM table_name; 可以显示最低数值的资料。
mysql> SELECT title,min(salary) FROM employee -> GROUP BY title;+------------------------+-------------+| title | min(salary) |+------------------------+-------------+| Database Administrator | 6000 || Project Manager | 8500 || Software Architect | 7200 || Software Engineer | 4750 || Test Engineer | 6500 |+------------------------+-------------+5 rows in set (0.01 sec)
SELECT SUM(column_name) FROM table_name; 可以显示资料数值之总和。
计算出各群组的薪水总和。
mysql> SELECT title,sum(salary) FROM employee -> GROUP BY title;+------------------------+-------------+| title | sum(salary) |+------------------------+-------------+| Database Administrator | 12800 || Project Manager | 8500 || Software Architect | 15200 || Software Engineer | 15350 || Test Engineer | 6500 |+------------------------+-------------+5 rows in set (0.00 sec)
SELECT AVG(column_name) FROM table_name; 可以显示资料数值之平均值。
计算出各群组的薪水总和及平均值。
mysql> SELECT title -> ,SUM(salary) -> ,AVG(salary) -> FROM employee -> GROUP BY title;+------------------------+-------------+-------------------+| title | SUM(salary) | AVG(salary) |+------------------------+-------------+-------------------+| Database Administrator | 12800 | 6400 || Project Manager | 8500 | 8500 || Software Architect | 15200 | 7600 || Software Engineer | 15350 | 5116.666666666667 || Test Engineer | 6500 | 6500 |+------------------------+-------------+-------------------+5 rows in set (0.01 sec)
SELECT column_name , count( * ) , AVG(salary) FROM table_name GROUP BY column_name HAVING title= "xx"; :当我们想过滤GROUP BY后的资料,就可以使用HAVING
与where不同在于,where是对GROUP BY之前的资料进行过滤,也就是全部的TABLE,而HAVING是针对GROUP BY之后的资料进行过滤。但大多还是使用where居多。
过滤GROUP BY之后指定title为Software Engineer的资料。
mysql> SELECT title, -> count(*), -> AVG(salary) -> FROM employee -> GROUP BY title -> HAVING title="Software Engineer";+-------------------+----------+-------------------+| title | count(*) | AVG(salary) |+-------------------+----------+-------------------+| Software Engineer | 3 | 5116.666666666667 |+-------------------+----------+-------------------+1 row in set (0.01 sec)
最后来个综合练习
Q1:求TOP5 历史票房最高的导演?
将导演去重複并分组后,依照票房高低作排序(降幂),限制在五笔资料。
mysql> SELECT -> director_name, -> SUM(gross) -> FROM `movie` -> GROUP BY -> director_name -> ORDER BY -> SUM(gross) DESC -> LIMIT -> 5;+-------------------+-------------+| director_name | SUM(gross) |+-------------------+-------------+| Christopher Nolan | 38012181818 || James Cameron | 36898631874 || Joss Whedon | 28139049796 || Peter Jackson | 26897421538 || Michael Bay | 25996453730 |+-------------------+-------------+5 rows in set (0.00 sec)
Q2:TOP5 拍过最多电影的导演?
mysql> SELECT -> director_name, -> count(director_name) -> FROM `movie` -> GROUP BY -> director_name -> ORDER BY -> count(director_name) DESC -> LIMIT -> 5 -> -> ;+-------------------+----------------------+| director_name | count(director_name) |+-------------------+----------------------+| Christopher Nolan | 84 || Peter Jackson | 84 || Bryan Singer | 84 || Gore Verbinski | 63 || Sam Raimi | 63 |+-------------------+----------------------+5 rows in set (0.00 sec)