Text Generation
Transformers
Safetensors
mixtral
conversational
text-generation-inference
Inference Endpoints
NickyNicky committed on
Commit
36997c8
1 Parent(s): 28d940e

Update README.md

Files changed (1)
  1. README.md +73 -0
README.md CHANGED
@@ -1,3 +1,76 @@
---
license: apache-2.0
---

```python
from transformers import AutoTokenizer, GenerationConfig

# attention_sinks provides a drop-in replacement for transformers'
# AutoModelForCausalLM with attention-sink (StreamingLLM-style) support.
from attention_sinks import AutoModelForCausalLM

import torch

# model_id = 'Open-Orca/Mistral-7B-OpenOrca'
model_id = 'NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3'

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    load_in_4bit=True,
    low_cpu_mem_usage=True,
    # use_flash_attention_2=True,  # A100 or other supported GPU
    attention_sink_size=4,
    attention_sink_window_size=1024,  # 512, # <- low for the sake of faster generation
)

max_length = 2048
print("max_length", max_length)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    # use_fast=False,
    max_length=max_length,
)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = 'right'

# EXAMPLE 1
txt = """<|im_start|>user
I'm looking for an efficient Python script to output prime numbers. Can you help me out? I'm interested in a script that can handle large numbers and output them quickly. Also, it would be great if the script could take a range of numbers as input and output all the prime numbers within that range. Can you generate a script that fits these requirements? Thanks!<|im_end|>
<|im_start|>assistant
"""

# EXAMPLE 2 (overwrites EXAMPLE 1; comment out the one you do not want)
txt = """<|im_start|>user
I'm developing a REST API with Node.js and I'm trying to add some kind of security, with tokens or something similar. Can you help me?<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer.encode(txt, return_tensors="pt").to("cuda")

max_new_tokens = 512  # was undefined in the original snippet; pick a sensible default

generation_config = GenerationConfig(
    max_new_tokens=max_new_tokens,
    temperature=0.7,
    top_p=0.9,
    top_k=50,  # the original used the undefined variable `len_tokens`
    repetition_penalty=1.11,
    do_sample=True,
    # pad_token_id=tokenizer.eos_token_id,
    # eos_token_id=tokenizer.eos_token_id,
    # use_cache=True,
    # stopping_criteria=StoppingCriteriaList([stopping_criteria]),
)
outputs = model.generate(
    generation_config=generation_config,
    input_ids=inputs,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))  # True to hide the ChatML tags
```
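
Both examples hand-build the same ChatML-style prompt string. A small helper (hypothetical, not part of the original script) makes the template explicit and avoids copy-paste mistakes with the `<|im_start|>` / `<|im_end|>` markers:

```python
def chatml_prompt(user_message: str) -> str:
    """Wrap a user message in the ChatML-style template used above,
    leaving the assistant turn open for the model to complete."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Build a prompt and pass it to tokenizer.encode as before
prompt = chatml_prompt("Write a Python function that tests whether a number is prime.")
print(prompt)
```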