2024-02-26 21:25:46 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='MBZUAI/MobiLlama-05B-Chat', revision='main', device='cuda', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False)
2024-02-26 21:25:46 | INFO | model_worker | Loading the model ['MobiLlama-05B-Chat'] on worker 2869e009 ...
2024-02-26 21:25:50 | INFO | model_worker | Register to controller
2024-02-26 21:25:50 | ERROR | stderr | INFO: Started server process [453480]
2024-02-26 21:25:50 | ERROR | stderr | INFO: Waiting for application startup.
2024-02-26 21:25:50 | ERROR | stderr | INFO: Application startup complete.
2024-02-26 21:25:50 | ERROR | stderr | INFO: Uvicorn running on http://localhost:21002 (Press CTRL+C to quit)
2024-02-26 21:26:35 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: None. call_ct: 0. worker_id: 2869e009.
2024-02-26 21:27:15 | INFO | stdout | INFO: 127.0.0.1:54126 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 21:27:20 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: None. call_ct: 0. worker_id: 2869e009.
2024-02-26 21:27:55 | INFO | stdout | INFO: 127.0.0.1:35858 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2024-02-26 21:28:05 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=4, locked=False). call_ct: 1. worker_id: 2869e009.
2024-02-26 21:28:50 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 1. worker_id: 2869e009.
2024-02-26 21:29:35 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 1. worker_id: 2869e009.
2024-02-26 21:30:01 | INFO | stdout | INFO: 127.0.0.1:59612 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 21:30:20 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 1. worker_id: 2869e009.
2024-02-26 21:30:27 | INFO | stdout | INFO: 127.0.0.1:37232 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2024-02-26 21:31:05 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 2. worker_id: 2869e009.
2024-02-26 21:31:50 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 2. worker_id: 2869e009.
2024-02-26 21:32:35 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 2. worker_id: 2869e009.
2024-02-26 21:33:20 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 2. worker_id: 2869e009.
2024-02-26 21:33:41 | INFO | stdout | INFO: 127.0.0.1:54940 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 21:34:05 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 2. worker_id: 2869e009.
2024-02-26 21:34:11 | INFO | stdout | INFO: 127.0.0.1:52316 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2024-02-26 21:34:50 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:35:35 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:36:21 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:37:06 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:37:51 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:38:36 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 3. worker_id: 2869e009.
2024-02-26 21:38:40 | INFO | stdout | INFO: 127.0.0.1:50000 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 21:39:18 | INFO | stdout | INFO: 127.0.0.1:43456 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2024-02-26 21:39:21 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 4. worker_id: 2869e009.
2024-02-26 21:40:06 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 4. worker_id: 2869e009.
2024-02-26 21:40:51 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 4. worker_id: 2869e009.
2024-02-26 21:41:06 | INFO | stdout | INFO: 127.0.0.1:39444 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 21:41:36 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-05B-Chat']. Semaphore: Semaphore(value=5, locked=False). call_ct: 4. worker_id: 2869e009.
2024-02-26 21:41:39 | INFO | stdout | INFO: 127.0.0.1:50026 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2024-02-26 21:42:11 | ERROR | stderr | INFO: Shutting down
2024-02-26 21:42:11 | ERROR | stderr | INFO: Waiting for application shutdown.
2024-02-26 21:42:11 | ERROR | stderr | INFO: Application shutdown complete.
2024-02-26 21:42:11 | ERROR | stderr | INFO: Finished server process [453480]