Normally, a Python program runs as a single process, with a single thread and a single coroutine. Several standard-library modules let us control each of these levels:
multiprocessing to control processes
threading to control threads
asyncio to control coroutines
The following article introduces some applications of coroutines.
Synchronous
A simple example that calls time.sleep(1) five times in a row.
It takes about 5 seconds in total.
```python
import time

start_time = time.time()

def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    time.sleep(sec)
    print('end at: ', time.time() - start_time)

def main():
    for i in range(5):
        sleep_sec(1)
    print('end of main: ', time.time() - start_time)

main()
```
Output
```
start at:  0.0001671314239501953
end at:  1.0058751106262207
start at:  1.005995512008667
end at:  2.006303310394287
start at:  2.0064895153045654
end at:  3.0076138973236084
start at:  3.007782220840454
end at:  4.0089030265808105
start at:  4.0090556144714355
end at:  5.009898662567139
end of main:  5.010054588317871
```
Asyncio
Using asyncio, the total time becomes about 1 second, since all five sleeps run concurrently instead of one after another.
```python
import time
import asyncio

start_time = time.time()

async def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    await asyncio.sleep(sec)
    print('end at: ', time.time() - start_time)

async def main():
    tasks = list()
    for i in range(5):
        tasks.append(asyncio.create_task(sleep_sec(1)))
    for task in tasks:
        await task
    print('end of main: ', time.time() - start_time)

asyncio.run(main())
```
Output
```
start at:  9.965896606445312e-05
start at:  0.0002307891845703125
start at:  0.0002894401550292969
start at:  0.00034356117248535156
start at:  0.000396728515625
end at:  1.0007236003875732
end at:  1.0008800029754639
end at:  1.000946283340454
end at:  1.0010099411010742
end at:  1.001070261001587
end of main:  1.001145601272583
```
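The create-then-await loop above can also be written with asyncio.gather, which schedules all the coroutines concurrently and returns their results in order. A minimal sketch, where sleep_sec is simplified to return its argument just to show how results come back:

```python
import time
import asyncio

async def sleep_sec(sec):
    await asyncio.sleep(sec)
    return sec

async def main():
    # gather runs all five coroutines concurrently and
    # returns their results in submission order
    return await asyncio.gather(*(sleep_sec(1) for _ in range(5)))

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start
print(results)  # [1, 1, 1, 1, 1] -- all five finish after about 1 second, not 5
```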
Another similar example.
Use a task to run a coroutine in the background while the following code keeps running at the same time.
There is no need to await task
unless you need the task's output.
```python
import time
import asyncio

start_time = time.time()

async def sleep_sec(sec):
    print('start at: ', time.time() - start_time)
    print('input sec: ', sec)
    await asyncio.sleep(sec)
    print('end at: ', time.time() - start_time)

async def main():
    task = asyncio.create_task(sleep_sec(1))
    print('asyncio.sleep(3)')
    await asyncio.sleep(3)
    # await task
    print('end of main: ', time.time() - start_time)

asyncio.run(main())
```
Output
```
asyncio.sleep(3)
start at:  0.0004246234893798828
input sec:  1
end at:  1.0014164447784424
end of main:  3.002758264541626
```
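When you do need the task's output, awaiting the task returns the coroutine's return value; if the task has already finished, the await simply fetches the stored result. A small sketch with a hypothetical work coroutine:

```python
import asyncio

async def work(sec):
    # hypothetical background job that returns a value
    await asyncio.sleep(sec)
    return sec * 10

async def main():
    task = asyncio.create_task(work(0.1))
    await asyncio.sleep(0.3)  # keep doing other things while the task runs
    # the task finished during the sleep; awaiting it just fetches its result
    return await task

value = asyncio.run(main())
print(value)  # 1.0
```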
Requests
The most common use of coroutines is web crawling, because sending many requests, or slow requests, one at a time is very inefficient.
Here is a very simple example that calls requests.get five times.
```python
import time
import requests

start_time = time.time()

def req_get():
    url = "https://www.google.com"
    print('start at: ', time.time() - start_time)
    r = requests.get(url)
    print('end at: ', time.time() - start_time)

def main():
    for i in range(5):
        req_get()
    print('end of main: ', time.time() - start_time)

main()
```
Output
```
start at:  0.00021576881408691406
end at:  0.0635523796081543
start at:  0.06368589401245117
end at:  0.13397932052612305
start at:  0.13453364372253418
end at:  0.20851564407348633
start at:  0.2091212272644043
end at:  0.28153157234191895
start at:  0.2821779251098633
end at:  0.35675716400146484
end of main:  0.35712122917175293
```
Async Requests
The output below shows that all the requests are sent back-to-back instead of waiting for each response before sending the next request. As a result, the total time drops to about a quarter of the synchronous version.
```python
import time
import asyncio
import requests

start_time = time.time()

async def req_get():
    url = "https://www.google.com"
    print('start at: ', time.time() - start_time)
    # requests is blocking, so run it in the default thread pool executor
    loop = asyncio.get_running_loop()
    r = await loop.run_in_executor(None, requests.get, url)
    print('end at: ', time.time() - start_time)

async def main():
    tasks = list()
    for i in range(5):
        tasks.append(asyncio.create_task(req_get()))
    for task in tasks:
        await task
    print('end of main: ', time.time() - start_time)

asyncio.run(main())
```
Output
```
start at:  0.0001125335693359375
start at:  0.0011854171752929688
start at:  0.004603147506713867
start at:  0.006804466247558594
start at:  0.009021997451782227
end at:  0.06898927688598633
end at:  0.08475136756896973
end at:  0.08681845664978027
end at:  0.08908772468566895
end at:  0.09105420112609863
end of main:  0.09117269515991211
```
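For a real crawl you usually do not want to fire hundreds of requests at once; asyncio.Semaphore caps how many run concurrently. A sketch of the pattern, using asyncio.sleep as a stand-in for the network call:

```python
import time
import asyncio

async def fake_request(sem, i):
    # at most two "requests" hold the semaphore at any moment
    async with sem:
        await asyncio.sleep(0.1)  # stand-in for a real network call
        return i

async def main():
    sem = asyncio.Semaphore(2)
    return await asyncio.gather(*(fake_request(sem, i) for i in range(6)))

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start
# six jobs, two at a time -> roughly three rounds of 0.1 s each
```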
Update
Beyond the requests
module, there is another useful module, aiohttp
, which is asynchronous by design.
```python
import time
import aiohttp
import asyncio

start_time = time.time()

async def fetch(client):
    async with client.get("https://www.google.com") as resp:
        assert resp.status == 200
        return await resp.text()

async def req_get():
    print('start at: ', time.time() - start_time)
    async with aiohttp.ClientSession() as s:
        r = await fetch(s)
    print('end at: ', time.time() - start_time)

async def main():
    tasks = [asyncio.create_task(req_get()) for _ in range(5)]
    await asyncio.gather(*tasks)

asyncio.run(main())
```
Output
```
start at:  0.0004086494445800781
start at:  0.007526874542236328
start at:  0.008102178573608398
start at:  0.008558034896850586
start at:  0.009565353393554688
end at:  0.06927990913391113
end at:  0.07057762145996094
end at:  0.07248950004577637
end at:  0.0732574462890625
end at:  0.07469344139099121
```
Here aiohttp
is a little faster than requests
.