0%

kafka的基本命令

Kafka是由Apache软件基金会开发的一个开源流处理平台,由Scala和Java编写。Kafka是一种高吞吐量的分布式发布订阅消息系统,它可以处理消费者规模的网站中的所有动作流数据。


基本命令使用

列出所有可用的topic

1
./bin/kafka-topics.sh --list --zookeeper 172.20.40.51:2181

新建topic命令

1
./bin/kafka-topics.sh -zookeeper 172.20.40.51:2181 -topic seclogs -replication-factor 1 -partitions 3 -create

删除topic

1
./bin/kafka-topics.sh --delete --zookeeper 172.20.40.51:2181 --topic seclogs

查看kafka数据

1
./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --broker-info --group yunwei-logstash --topic seclogs --zookeeper 172.20.40.51:2181

前端三种路由配置

  • 第一种

    1
    2
    3
    4
    5
    location /anti-fraud {
    alias /opt/welab/anti-fraud;
    index index.html index.htm;
    error_page 404 /anti-fraud/index.html;
    }
  • 第二种

    1
    2
    3
    4
    5
    location /financial-life {
    rewrite /financial-life/(.+) /$1 break;
    root /opt/welab/financial-life;
    index index.html index.htm;
    }
  • 第三种

    1
    2
    3
    4
    5
    location /sjd {
    alias /opt/welab/sjd;
    index index.html index.htm;
    try_files $uri /sjd/login /sjd/main.html;
    }

跨域配置

1
2
3
add_header Access-Control-Allow-Origin *;
add_header Access-Control-Allow-Headers content-type,x-user-token;
add_header Access-Control-Allow-Methods GET,POST;

python学习笔记

[toc]

推导式

推导式是从一个或者多个迭代器快速简洁地创建数据结构的一种方法。它可以将循环和条件判断结合,从而避免语法冗长的代码。

列表推导式
1
2
3
number_list = [number for number in range(1,6) if number % 2 == 1]
print(number_list)
[1,3,5]
字典推导式
1
2
3
4
word = 'letters'
letter_counts = {letter:word.count(letter) for letter in set(word)}
print(letter_counts)
{'t': 2, 'l': 1, 'e': 2, 'r': 1, 's': 1}
集合推导式
1
2
3
a_set = {number for number in range(1,6) if number % 3 == 1}
print(a_set)
{1, 4}
生成器函数

生成器函数:函数体含有yield关键字,生成器就是迭代器

1

装饰器

装饰器的功能是在不修改被装饰对象源代码以及调用方式的前提下,为其添加新功能。

  • res是一个用户密码认证的装饰器,index要运行的函数,用装饰器扩展功能后,运行index可以增加用户认证的功能。
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    import time
    import random
    import sys

    user_dit={
    'user01':'python01',
    'user02':'python02',
    'user03':'python03',
    }

    login_status = {
    'user':None,
    'login':False,
    }

    with open('auth_info.txt','w',encoding='utf-8') as f:
    f.seek(0)
    f.truncate()
    f.write(str(user_dit))

    db_file = 'auth_info.txt'
    def res(app):
    def auth(*args,**kwargs):
    if login_status['user'] and login_status['login'] :
    user_name = login_status['user']
    app(user_name, **kwargs)
    else:
    user_name = input('user name: ')
    user_passwd = input('user passwd: ')
    with open(db_file, 'r', encoding='utf-8') as f:
    f_out = eval(f.read())
    if user_name in f_out and user_passwd == f_out[user_name]:
    print('login seccess')
    login_status['user'] = user_name
    login_status['login'] = True
    app(user_name,**kwargs)
    else:
    print('username or passwd error')
    sys.exit()
    return auth

    @res # index = res(index)
    def index(user_name,**kwargs):
    time.sleep(random.randrange(1,5))
    print('welecome to %s index page!' %user_name)

    index()

    index() # 第一次登录成功后,后面再次操作不需要输入密码

迭代器

1
2
res = [ num for num in range(6) ]
print(res)

生成器

函数体内包含yield关键字,运行时遇到yield暂停完成一次迭代,生成器就是迭代器。

生成器模拟linux命令

tail -f filename | grep word

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
import time

def tail(file_name):
with open(file_name,encoding='utf-8') as f:
f.seek(0,2)
while True:
line = f.readline()
if line:
yield line
else:
time.sleep(0.5)

def grep(lines,word):
for i in lines:
if word in i:
print(i)

g = tail('a.txt')
grep(g,'error')
生成器表达式
1
2
3
res = (number for number in range(1,6) if number % 2 == 1)
for i in res:
print(i)
表达式形式的yield
1
2
3
4
5
6
7
8
9
10
11
12
13
14
def init(func):
def res(*args,**kwargs):
g = func(*args,**kwargs)
next(g)
return g
return res

init
def foo():
while True:
x = yield
print(x)

g = foo()

三元表达式

可以缩写if判断

1
2
3
4
5
6
7
8
9
def max(x,y):
# if x > y:
# return x
# else:
# return y

return x if x > y else y

print(max(1,2))

匿名函数lambda

1
2
res = lambda x,y:x*y
print(res(2,3))

递归调用

1
2
3
4
5
6
def age(n):
if n == 5:
return 18
return age(n+1)+2

print(age(1))

爬虫工具

selenium

pip install selenium安装,需要下载chrome驱动。

1
2
3
4
5
6
import selenium
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.baidu.com')
print(driver.page_source)
phantomjs

phantomjs下载页,不需要打开浏览器,静默爬虫。

1
2
3
4
5
6
import selenium
from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get('https://www.baidu.com')
print(driver.page_source)
pyquery

pip install pyquery

1
2
3
4
from pyquery import PyQuery as pq

doc = pq('<html>Hello World!</html>')
print(doc('html').text())
jupyter

可以web运行python命令,写笔记。

命令行传参

1
2
3
import sys

print('Program agruments: ',sys.argv)

正则re

查找findall
1
2
3
import re
r = re.compile("a\d")
res = r.findall("a1,ab")
分组( )
1
2
3
4
import re
r = re.compile("(?P<year>20[0|1]\d)-(?P<month>\d{1,2})")
res = r.search("2018-03")
print(res.group("year"))
爬虫豆瓣电影排名
1
2
3
4
5
6
7
8
9
10
11
12
13
14
import requests,re

def getpage():
res = requests.get("https://movie.douban.com/top250?start=25&filter=")
return res.text

def run():
ret = getpage()
r = re.compile('<div class="item">.*?<em.*?>(\d+)</em>?.*?<div class="info">.*?<span class="title">(.*?)</span>?.*?<div class="star">.*?<span>(\d+)人评价</span>',re.S)
res = r.findall(ret)
for item in res:
print(item)

run()

configparser模块(ini文件)

1
2
3
4
5
6
7
8
9
import configparser

cfg = configparser.ConfigParser()
# DEFAULT全局配置
cfg["DEFAULT"] = {"Access":"ReadWrite","Connect":"DSN=AdvWorks"}
cfg["USER"] = {"Name":"User"}

with open("cfg.ini","w") as f:
cfg.write(f)
1
2
3
4
5
6
7
import configparser

cfg = configparser.ConfigParser()
cfg.read("cfg.ini")

print(cfg["DEFAULT"]["Access"])
print(cfg.items("USER"))

subprocess模块(子进程执行linux命令)

1
2
3
4
5
import subprocess

s = subprocess.Popen("dir",shell=True)
s.wait()
print("ending")
1
2
3
4
import subprocess

s = subprocess.Popen("dir",shell=True,stdout=subprocess.PIPE)
print(s.stdout.read().decode("gbk"))

class,__init__方法

1
2
3
4
5
6
7
8
9
10
11
12
class Chinese:
country = "china"

def __init__(self,name,age):
self.name = name
self.age = age

def work(self):
print("work")

p1 = Chinese("user01","22")
p2 = Chinese("user02","23")

pyhton代码规范检查

pycodestyle检查代码是否符合pep 8规范

python官方提供了检查pep8代码规范的命令行工具pycodestyle,改工具可以检查python代码是否违反pep 8规范,并对违反的地方给出提示。

1
2
3
4
5
6
7
pip install pycodestyle

# 对python代码检查并打印检查报告
pycodestyle --first test.py

# --show-source显示不规范的源码
pycodestyle --show-source --show-pep8 test.py
使用autopep8将代码格式化

autopep8是一个开源的命令行工具,他能够将python代码自动格式化为pep8风格。autopep8使用pycodestyle来决定哪部分代码需要格式化。

1
2
3
4
pip install autopep8

# 代码自动格式化pep8规范
autopep8 --in-place test.py

读取标准输出

cat fileinput.py

1
2
3
4
import fileinput

for line in fileinput.input():
print(line, end="")
1
python fileinput.py /etc/passwd /etc/hosts

类的继承

继承是创建新的类的一种方式,继承可以减少重复代码。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class people:
def __init__(self,name,age):
self.name = name
self.age = age
def walk(self):
print('%s is walking' %self.name)

class teacher(people):
school = "abc"
def __init__(self,name,age,sex):
people.__init__(self,name,age)
self.sex = sex
def teach(self):
print('%s is teaching' %self.name)

class student(people):
def __init__(self,name,age,level):
people.__init__(self,name,age)
self.level = level
def study(self):
print('%s is studying' %self.name)

t = teacher('egon','18','man') # 实例化
print(t.name,t.age,t.school,t.sex)
super
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class people:
def __init__(self,name,age):
self.name = name
self.age = age
def walk(self):
print('%s is walking' %self.name)

class teacher(people):
school = "abc"
def __init__(self,name,age,sex):
super().__init__(name,age)
self.sex = sex
def teach(self):
print('%s is teaching' %self.name)

t = teacher('egon','18','man') # 实例化
print(t.name,t.age,t.school,t.sex)

类的组合

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
class date():
def __init__(self,year,mon,day):
self.year = year
self.mon = mon
self.day = day
def tell_date(self):
print('今天的日期是:%s年-%s月-%s日' %(self.year,self.mon,self.day))

class teacher:
def __init__(self,name,age,year,mon,day):
self.name = name
self.age = age
self.date = date(year,mon,day)
def teach(self):
print('{} is teaching'.format(self.name))

t = teacher('user01',18,2018,22,12)
print(t.name,t.age,t.date.year,t.date.mon,t.date.day)

归一化接口

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
import abc

class Animal(metaclass=abc.ABCMeta):
@abc.abstractmethod
def talk(self):
pass

class people(Animal):
def talk(self):
print('say hello')

class dog(Animal):
def talk(self):
print('wang wang wang')

d = dog()
d.talk()

绑定方法与非绑定方法

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import abc
import hashlib
import time

class Animal(metaclass=abc.ABCMeta):
@abc.abstractmethod
def talk(self):
pass

class people(Animal):
def talk(self):
print('say hello')

@classmethod
def func(cls):
print('say func')

@staticmethod
def sum(x,y):
print(x+y)

# 绑定到函数的方法
p = people()
p.talk()
# 绑定到类的方法
people.func()
# 非绑定方法
people.sum(1,2)

反射

hasattr查看对象是否有某个属性
getattr查看对象某个属性的值
setattr 设置对象某个属性的值
1
2
3
4
5
6
7
8
9
class foo:
say = "hello"
def __init__(self,name,age):
self.name = name
self.age = age

print(hasattr(foo,"say"))
setattr(foo,"say","hello world")
print(getattr(foo,"say"))

面向对象高级用法

__str__可以替换内存地址信息为别的信息
1
2
3
4
5
6
7
8
9
10
class foo:
def __init__(self,name,age):
self.name = name
self.age = age

def __str__(self):
return "{}:{}".format(self.name,self.age)

f = foo("user",18)
print(f)
__del__程序执行完后的操作,执行清理操作
1
2
3
4
5
6
7
8
9
class foo:
def __init__(self,name,age):
self.name = name
self.age = age

def __del__(self):
print("程序运行完成")

f = foo("user",18)
类使用[ ]取值
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class foo:
def __init__(self,name,age):
self.name = name
self.age = age

def __getitem__(self, item):
return getattr(self,item)

def __setitem__(self, key, value):
setattr(self,key,value)

def __delitem__(self, key):
delattr(self,key)

f = foo("user",18)
print(f.__dict__)
f["name"] = "user01"
print(f.__dict__)
del f["name"]
print(f.__dict__)

socket编程

tcp socket模拟ssh

server

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
import socket
import subprocess

sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
sock.bind(("",9999))
sock.listen(1)

while True:
conn,addr = sock.accept() #等待、阻塞
print("got a new customer",conn,addr)

while True:
data = conn.recv(1024) #等待 bytes
print("received",data)
if not data: #客户端断开了
print("conn %s is lost" % str(addr) )
conn.close() #清除了跟这个断开的链接的实例对象
break
#conn.send(data.upper() )
cmd = subprocess.Popen(data.decode("utf-8"),
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

stdout = cmd.stdout.read()
stderr = cmd.stderr.read()
cmd_res = stdout + stderr
if not cmd_res:
cmd_res = b'cmd has not output'

conn.sendall(cmd_res) #不能发空消息

sock.close() #关闭端口和服务

client

1
2
3
4
5
6
7
8
9
10
11
12
13
import socket

sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM) #数据流
sock.connect(('127.0.0.1',9999))

while True:
msg = input(">>>:").strip()
if not msg : continue
sock.send(msg.encode("utf-8"))
response = sock.recv(1024)
print("received:",response.decode("gbk"))

sock.close()
udp socket

server

1
2
3
4
5
6
7
8
9
10
11
import socket

ip_port = ('127.0.0.1',9999)

udp_server_client = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
udp_server_client.bind(ip_port)

while True:
msg,addr = udp_server_client.recvfrom(1024)
print(msg,addr)
udp_server_client.sendto(msg,addr)

client

1
2
3
4
5
6
7
8
9
10
11
12
13
import socket

ip_port = ('127.0.0.1',9999)

udp_server_client = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)

while True:
msg = input('>>: ').strip()
if not msg:continue
udp_server_client.sendto(msg.encode('utf-8'),ip_port)

bank_msg,addr = udp_server_client.recvfrom(1024)
print(bank_msg.decode('utf-8'),addr)

threading线程

event

当有几个线程的时候,可能线程A依赖线程B的运行,可以用event对象让线程A等待线程B

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
import threading
import time

event = threading.Event()

def foo():
print("wait...")
event.wait(10)
print("connect to redis server")

def bar():
while not event.is_set():
print("waiting for even...")
event.wait(1)

for i in range(5):
t1 = threading.Thread(target=foo,args=())
t1.start()

t2 = threading.Thread(target=bar,args=())
t2.start()

print("attempt to start redis server")
time.sleep(8)

print("set...")
event.set()
线程队列queue

先进先出和先进后出

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
import queue
import threading
import time

def foo():
q.put(3.14)
q.put("hello")
q.put("yes")

def bar():
print(">>>start...")
while not q.empty():
print(q.get())
q.task_done()

q = queue.Queue(10) # queue.LifoQueue(10) # queue.PriorityQueue(10)

t1 = threading.Thread(target=foo,args=())
t2 = threading.Thread(target=bar,args=())

t1.start()
t2.start()

优先级模式

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
import queue
import threading
import time

def foo():
q.put([1,3.14])
q.put([3,"hello"])
q.put([2,"yes"])

def bar():
print(">>>start...")
while not q.empty():
print(q.get()【】)
q.task_done()

q = queue.PriorityQueue(10)

t1 = threading.Thread(target=foo,args=())
t2 = threading.Thread(target=bar,args=())

t1.start()
t2.start()

进程 multiprocessing

multiprocessing
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
import multiprocessing
import time

def foo(n):
res = 1
for i in range(1,n):
res += i
print(res)

def bar(n):
res = 1
for i in range(1,n):
res *= i
print(res)

s = time.time()

if __name__ == '__main__':
m1 = multiprocessing.Process(target=foo,args=(100000000,))
m2 = multiprocessing.Process(target=bar,args=(100000,))
m1.start()
m2.start()
m1.join()
m2.join()
print(time.time() - s)
进程queue通信
1
2
3
4
5
6
7
8
9
10
import multiprocessing

def foo(q):
q.put([11,"hello",True])

if __name__ == '__main__':
q = multiprocessing.Queue()
m = multiprocessing.Process(target=foo,args=(q,))
m.start()
print(q.get())
管道pipe通信
1
2
3
4
5
6
7
8
9
10
11
12
13
from multiprocessing import Pipe,Process

def foo(sock):
sock.send("hello world!")
print(sock.recv())

sock,conn = Pipe()

if __name__ == '__main__':
m = Process(target=foo, args=(sock,))
m.start()
print(conn.recv())
conn.send("hi boy!")
manage进程通信(最优方法)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
from multiprocessing import Manager,Process

def foo(l,i):
l.append(i*i)

if __name__ == '__main__':
manage = Manager()
mlist = manage.list([11,22,33])

for i in range(5):
p = Process(target=foo, args=(mlist,i))
p.start()
p.join()

print(mlist)
使用进程池并发pool
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
from multiprocessing import Pool
import time

def foo(n):
print(n)
time.sleep(2)

if __name__ == '__main__':

p = Pool(5)
for i in range(100):
p.apply_async(func=foo,args=(i,))
p.close()
p.join()

print("ending...")

协程

yield
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
import time

def consumer():
r = ''
while True:
n = yield r
if not n:
return
print("consumer: ",n)
time.sleep(3)
r = "200 ok"

def produce(c):
next(c)
n = 0
while n < 5:
n = n + 1
print("produce: ",n)
cr = c.send(n)
print("consumer return: ",cr)
c.close()

if __name__ == '__main__':
c = consumer()
produce(c)
greenlet
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
from greenlet import greenlet

def foo():
print("ok1")
g2.switch()
print("ok3")
g2.switch()

def bar():
print("ok2")
g1.switch()
print("ok4")

g1 = greenlet(foo)
g2 = greenlet(bar)

g1.switch()
geven
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
import requests
import gevent
import time

def foo(url):
res = requests.get(url)
res_str = res.text
print("len: ",len(res_str))

s = time.time()
gevent.joinall([
gevent.spawn(foo, "https://www.baidu.com/"),
gevent.spawn(foo, "https://translate.google.cn/"),
gevent.spawn(foo, "https://www.zhihu.com/")
])

print(time.time() - s)

io多路复用

selectpollepoll模型

socketserver

server端

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import socketserver
import subprocess

class myserver(socketserver.BaseRequestHandler):
def handle(self):
print("from conn: ",self.request)

while True:
data = self.request.recv(1024)
print(data.decode("utf-8"))

if data.decode("utf-8") == "q":
break
response = subprocess.Popen(data.decode("utf-8"),
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
stdout = response.stdout.read()
stderr = response.stderr.read()
res = stdout + stderr
if type(res) is bytes:
self.request.send(res)
else:
self.request.send(res.encode("utf-8"))

s = socketserver.ThreadingTCPServer(("127.0.0.1",8800),myserver)

s.serve_forever()

client端

1
2
3
4
5
6
7
8
9
10
11
12
13
14
import socket
import platform

sock = socket.socket()
sock.connect(("127.0.0.1",8800))

while True:
data = input(">>>: ")
sock.send(data.encode("utf-8"))
response = sock.recv(1024)
if platform.system() == "Windows":
print(response.decode("gbk"))
else:
print(response.decode("utf-8"))

定时任务

Timer
1
2
3
4
5
6
7
8
from threading import Timer

def do_timer():
print('hello timer')
timer = Timer(1, do_timer)
timer.start()

do_timer()
schedule
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
from threading import Thread
import schedule
import datetime
import time

def job():
print(datetime.datetime.now())

def do_job():
Thread(target=job).start()

def run():
# #每10分钟执行一次job函数
# schedule.every(10).minutes.do(test)
# # 每10秒执行一次job函数
# schedule.every(10).seconds.do(test)
# # 当every()没参数时默认是1小时/分钟/秒执行一次job函数
# schedule.every().hour.do(test)
# schedule.every().day.at("10:30").do(test)
# schedule.every().monday.do(test)
# # 具体某一天某个时刻执行一次job函数
# schedule.every().wednesday.at("13:15").do(test)
# # 可以同时定时执行多个任务,但是每个任务是按顺序执行
# schedule.every(10).seconds.do(job2)
# # 如果job函数有有参数时,这么写
# schedule.every(10).seconds.do(job,"参数")

schedule.every(10).minutes.do(do_job)
while True:
schedule.run_pending()
time.sleep(1)

run()

re正则

1
2
3
4
5
import re

re_obj = re.compile('(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d{3})\d{3}(Z)')
start_time = re_obj.match(format_start_time).group(1) + re_obj.match(format_start_time).group(2)
end_time = re_obj.match(format_end_time).group(1) + re_obj.match(format_end_time).group(2)

flask_caching增加缓存

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
from flask_caching import Cache

cache = Cache(config={'CACHE_TYPE': 'simple'})
cache.init_app(app)
````


#### xml转doct

```python
import xmltodict

with open('server.xml','r') as f:
f_xml = f.read()
f_dict = xmltodict.parse(f_xml)

url编码解码

1
2
3
from urllib import parse
parse.quote_plus('抱歉')
parse.unquote_plus('%E6%8A%B1%E6%AD%89')

高阶函数

map

1
2
list(map(str, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
['1', '2', '3', '4', '5', '6', '7', '8', '9']

add-apt-repository

apt-get install python-software-properties
apt-get install software-properties-common

云服务器磁盘挂载与扩容

磁盘挂载

  1. 上传脚本至服务器,命令行运行bash -x init.disk.sh执行脚本,等待脚本运行结束。

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    95
    96
    97
    98
    99
    100
    101
    102
    103
    104
    105
    106
    107
    108
    109
    110
    111
    112
    113
    114
    115
    116
    117
    118
    119
    120
    121
    122
    123
    124
    125
    126
    127
    128
    129
    130
    131
    132
    133
    134
    135
    136
    #!/usr/bin/env bash
    ### 自动挂载磁盘
    ### 适用于阿里云服务器自动挂载磁盘

    ### 功能
    # 适用于阿里云服务器第一次
    # 将挂载的磁盘,每块依次挂载到/data/ /data1/ /data2目录
    # 无需传参,自动使用磁盘所有存储空间

    #信号非0退出脚本
    #set -e

    #去掉rc.local
    grep init_disk.sh /etc/rc.local >/dev/null && sed -i 's/^bash \/usr\/local\/scripts\/init_disk.sh/# &/g' /etc/rc.local

    #清理脚本
    trap "rm -rf ${BASH_SOURCE}" EXIT

    # ctrl+c
    trap "" SIGINT SIGQUIT

    ### 先检查磁盘信息
    __check() {
    #获取磁盘信息
    _dev=`fdisk -l 2> /dev/null |grep /dev/...1 |head -n 1 |cut -c 1-7`
    _disk_all=(
    `fdisk -l 2>/dev/null |grep -v "doesn't contain" |grep -o "${_dev}." |uniq -c |grep -oP "(?<=1 )${_dev}."`
    )

    #检查磁盘(没有可操作的磁盘则退出脚本)
    echo ${_disk_all[@]} |grep -E "/dev/[a-z]{3}" || { echo "No new disks were found" && exit 1; }
    }

    ### 前期准备
    __ready() {

    # 根据主机名修改hosts
    _hostname=`hostname`
    grep ${_hostname} /etc/hosts || sed -i "s/.*127.0.0.1.*/127.0.0.1 ${_hostname} localhost/g" /etc/hosts

    # 判断系统
    which apt-get && system=ubuntu
    which yum && system=centos

    # 安装软件包
    case ${system} in
    ubuntu)
    apt-get update
    apt-get -y install lvm2 || echo "lvm2 Already installed"
    ;;
    centos)
    yum -y install lvm2 || echo "lvm2 Already installed"
    ;;
    *)
    echo "Unidentified _system environment variable"
    esac

    }

    #执行主程序
    __main() {

    #磁盘分区
    fdisk ${_disk} <<-EOF
    n
    p
    1


    t
    8e
    wq
    EOF

    #创建lvm
    sync ;sleep 2
    pvcreate ${_disk}1
    vgcreate welab_vg${_num} ${_disk}1
    lvcreate -l 100%VG -n welab_vol${_num} welab_vg${_num}
    mkfs -t ext4 /dev/welab_vg${_num}/welab_vol${_num}
    [ -d /data${_num} ] || mkdir -p /data${_num}

    # parted /dev/vdc
    # mklabel gpt
    # mkpart primary 1024K 3220G
    # Ignore
    # toggle 1 lvm
    # q

    # pvcreate /dev/vdc1
    # vgcreate welab_vg1 /dev/vdc1
    # lvcreate -L 2998G -n welab_vol1 welab_vg1
    # mkfs -t ext4 /dev/welab_vg1/welab_vol1
    # mkdir /data1
    # echo "UUID=`blkid /dev/welab_vg1/welab_vol1 |awk -F'"' '{print $2}'` /data1 ext4 defaults 0 0" >> /etc/fstab
    # mount -a

    # pvcreate /dev/vd[cdef]1
    # vgcreate -A y -s 128M welab_vg1 /dev/vd[cdef]1
    # lvcreate -A y -i 4 -I 8 -l 100%VG -n welab_vol1 welab_vg1
    # mkfs.ext4 /dev/welab_vg1/welab_vol1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1,stride=2,stripe_width=32 -b 4096 -T largefile -L welab_vol1
    # echo "UUID=`blkid /dev/welab_vg1/welab_vol1 |awk -F'"' '{print $2}'` /data1 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0" >> /etc/fstab

    #去掉遗留的data挂载
    #sed -ri '/\/data[0-9]?/d' /etc/fstab

    #挂载磁盘
    [ -b /dev/welab_vg${_num}/welab_vol${_num} ] && \
    echo "UUID=`blkid /dev/welab_vg${_num}/welab_vol${_num} |awk -F'"' '{print $2}'` /data${_num} ext4 defaults 0 0" >> /etc/fstab
    sleep 3
    mount -a
    }

    __do_main() {
    _data_name=`df -h |grep -oP "data(\d*)?"`
    _data_num=`echo ${_data_name} |sed 's/data//g'`
    [ x${_data_num} == x'' ] && _num=-1 || _num=$[ ${_data_num}+1 ]
    [ x${_data_name} == x'data' ] && _num=0

    for _disk in ${_disk_all[@]} ; do
    _size=$[ `fdisk -l 2>/dev/null |grep ${_disk} |awk '{print $5}'`/1024/1024/1024-1 ]

    if [ ${_size} -gt 20 ] ;then
    _num=$[${_num}+1]
    [ ${_num} -eq 0 ] && unset _num
    __main
    else
    echo "Disk ${_disk} capacity is less than 20G"
    fi
    done
    }

    __check
    __ready
    __do_main

磁盘扩容

  1. 格式化分区

    1
    2
    3
    4
    5
    6
    7
    8
    9
    fdisk /dev/vdc <<-EOF
    n
    p
    1

    t
    8e
    wq
    EOF
  2. 先卸载需要扩容的磁盘

    1
    umount /data
  3. 增加磁盘,创建pv

    1
    pvcreate /dev/vdc1
  4. 扩容vg

    1
    vgextend welab_vg /dev/vdc1
  5. 扩容lv

    1
    lvextend -l 100%VG /dev/welab_vg/welab_vol
  6. 检查文件系统

    1
    e2fsck -f /dev/welab_vg/welab_vol
  7. 扩容文件系统

    1
    resize2fs /dev/welab_vg/welab_vol
  8. 添加fatab,重新挂载

    1
    mount -a
  9. 建议卸载磁盘,不能卸载的情况下可以最后执行

    1
    mount -o remount,rw /data

一、ansible的安装

1. 安装ansible,ansible需要python支持,所有一并安装python

1
[root@student01 ~]# yum -y install python ansible

2.创建一个工作目录,hosts包含一台主机

1
2
3
[root@student01 ~]# mkdir playbook
[root@student01 ~]# cat playbook/inventory/hosts
student02 ansible_ssh_host=172.25.0.12 ansible_ssh_port=22

3. ansible服务器与其他机器创建信任关系

1
2
[root@student01 ~]# ssh-keygen
[root@student01 ~]# ssh-copy-id root@student02

4. 使用ping模块测试连通状态

1
2
3
4
5
[root@student01 ~]# ansible student02 -i playbook/inventory/hosts -m ping
student02 | SUCCESS => {
"changed": false,
"ping": "pong"
}

注:如果命令执行失败,可以在命令后面加上-vvvv,查看详细信息

1
[root@student01 ~]# ansible student02 -i playbook/inventory/hosts -m ping -vvvv

5. 查看服务器启动时间

1
2
3
[root@student01 ~]# ansible student02 -i playbook/inventory/hosts -a "uptime"
student02 | SUCCESS | rc=0 >>
16:33:32 up 1:22, 2 users, load average: 0.00, 0.01, 0.05

6. 安装nginx服务以及重启nginx服务

1
2
[root@student01 ~]# ansible student02 -i playbook/inventory/hosts -m yum -a "name=nginx"
[root@student01 ~]# ansible student02 -i playbook/inventory/hosts -m service -a "name=nginx state=restarted"

二、YAML语法介绍

1. 文件的起始

YAML文件以---开头以标记文件的开头,如果忘记开头三个减号,也不会影响Ansible的运行

1
---

2. 注释

注释以#开头,一直到本行的结束

1
#This is a YAML comment

3. 字符串

YAML字符串不需要使用""引起来

1
Ansible playbook

4. 布尔值

Ansible经常使用的布尔值有turefalseyesno

5. 列表

列表中的所有成员都开始于相同的缩进级别, 并且使用一个- 作为开头

1
2
3
- bash shell
- python
- c++

6. 字典

由一个简单的键: 值的形式组成

1
2
3
hostname: student01
ip: 172.25.0.11
server: web

7. 折行

在需要向模块传递很多参数的时候,为了美观可以把很长的段落写成多行,使用>标记折行,YAML会把折行替换为空格

1
2
3
ip: >
172.25.0.11
172.25.0.12

注:JSON文件可以直接当做YAML文件使用,但是不建议把playbook写成JSON脚本,因为YAML的亮点就是易于阅读

三、playbook简介

playbook是用于配置管理的脚本,使用Ansible时大部分时间在编写playbook

1. 我们先写一个简单的playbook,搭建web服务

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
[root@student01 ~]# cat playbook/web-configure.yml 
- name: configure webserver with nginx
hosts: webserver
sudo: Ture
tasks:
- name: install nginx
yum: name=nginx update_cache=yes

- name: copy nginx config file
copy: src=files/nginx_8080_localhost.conf dest=/etc/nginx/conf.d/8080
_localhost.conf

- name: copy index.html
template: src=templates/index.html.j2 dest=/usr/share/nginx/html/inde
x.html mode=0644

- name: restart nginx
service: name=nginx state=restarted

在playbook中,建议向模块传递参数的时候用yes和no,在其他地方使用ture和false

2. 把nginx的配置文件放在playbook/files/nginx_8080_localhost.conf ,用这个配置文件覆盖默认的配置文件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
	[root@student01 ~]# cat playbook/files/nginx_8080_localhost.conf 
server {
listen 8080 default_server;
listen [::]:8080 default_server;

root /usr/share/nginx/html;
index index.html index.htm;
server_name localhost;

location / {
}

error_page 404 /404.html;
location = /40x.html {
}

error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}

安装惯例,我们把一般文件放在files目录中,将jinja2模板文件放在templates目录中

3. 创建一个首页

1
2
3
4
5
6
7
8
9
[root@student01 templates]# cat index.html.j2 
<html>
<head>
<body>
<h1>nginx, configured by ansible</h1>
<p> if you can see this, ansible successfully installed nginx.</p>
<p>{{ ansible_managed }}</p>
</body>
</html>

这个模板引用了一个ansible变量ansible_managed,当ansible渲染这个模板的时候,会将变量替换成这个模板文件生成时间相关信息

4. 在hosts文件的主机前面加上[webserver],创建一个webserver群组,表明下面的主机属于webserver

1
2
3
4
[root@student01 ~]# cat playbook/inventory/hosts 
[webserver]
student02 ansible_ssh_host=172.25.0.12 ansible_ssh_port=22
student03 ansible_ssh_host=172.25.0.13 ansible_ssh_port=22

5. 执行playbook

1
[root@student01 ~]#ansible-playbook -i playbook/inventory/hosts      playbook/web-configure.yml

6. 如果没有任何报错,现在可以通过浏览器访问http://localhost:8080并看到HTML网页

四、playbook深入剖析

1. play

通过观察YAML发现,playbook就是一组play组成的列表,每个play都必须包含两项

host:需要配置一组主机

task:需要在这些主机上执行的任务

play还支持一些可选的配置,先介绍三个最常用的

name:一个注释,描述这个play是做什么的

sudo:如果为真,Ansible会在执行task的时候运行sudo切换root用户执行

var:变量和值组成的列表

2. task

举例一个安装nginx的task

1
2
- name: install nginx
yum: name=nginx update_cache=yes

这个task告诉yum安装一个叫nginx的软件包,安装前更新安装包缓存

如果想要把参数写成多行的话,可以使用折行语法

1
2
3
4
- name: install nginx
yum: >
name=nginx
update_cacha=yes

3. 模块

模块是由Ansible包装后在主机上执行的一系列脚本,常用的有以下模块
yum:使用yum包管理器安装和删除软件包
copy:将文件从本地复制到主机
file:设置文件、符号链接或者目录属性
service:启动、停止或者重启服务
template:从模板生成文件并复制到主机

ansible-doc命令工具可以查看官方文档

1
[root@student01 ~]# ansible-doc service

4. 将他们整合到一起,组成一个play

这是一张实体关系图,他描述了playbook、play、host、task和module之间的关系

1
2
3
4
5
graph LR
playbook-->play
play-->host
play-->task
task-->module

5. 跟踪主机状态

当运行ansible-playbook时,Ansible便会输出他在play中执行每一个task的状态信息

有些任务的状态是changed,显示的黄色,状态ok,显示的是绿色

Ansible模块会在采取任何行动之前查看主机的状态是否需要改变,如果主机的参数与模块的参数相匹配,那么不会做任何操作,直接响应ok,如果参数不匹配,则会改变主机的状态并的返回changed

6. vars

在playbook的play中可以使用vars区段定义变量

1
2
vars:
server_name: localhost

7. handler

handler是Ansible提供的条件机制之一,handler和task很相似,但是他只有被task通知后才会执行,Ansible识别到task改变了系统的状态,task就会触发通知机制

1
2
3
4
5
6
7
8
9
tasks:
- name: install nginx
yum: name=nginx update_cache=yes
notify: restart nginx
handler:
- name: restart nginx
service: >
name=nginx
state=restarted
  task将handler的名字`restart nginx`作为参数传递,只有`install nginx`这个task被执行成功了,才会触发`restart nginx`

注意:handler只会在task执行完毕了才会执行,即使触发了很多次也只会执行一次

五、inventory:描述你的服务器

1. inventory描述

在Ansible中,描述主机的方法是把它们写到inventory文件中,一个简单的inventory可以只包含一些主机列表

1
2
3
student01
student02.example.com
172.25.0.13

2. inventory行为参数

名称 默认值 描述
ansible_ssh_host 主机的名字 ssh目的主机
ansible_ssh_port 22 ssh目的端口
ansible_ssh_user root ssh登录用户
ansible_ssh_pass none ssh认证密码
ansible_connection smart ssh连接主机的连接模式
ansible_ssh_private_key_file none ssh认证秘钥
ansible_shell_type bash 命令使用的shell
ansible_python_interpreter /usr/bin/python python解释器路径

3. 群组

在执行task的时候,我们希望针对一组主机执行操作,这时候就可以定义群组,Ansible有一个默认的群组all,包含了所有的主机

1
2
3
4
5
6
[root@student01 ~]# ansible all -i playbook/inventory/hosts -a 'date'
student02 | SUCCESS | rc=0 >>
Fri May 19 16:11:53 CST 2017

student03 | SUCCESS | rc=0 >>
Fri May 19 16:11:53 CST 2017

我们可以在inventory文件中定义自己的群组,inventory文件是.ini格式的,在.ini格式中,同类的配置值归类在一起组成区段,我定义了一个群组webserver,包含两台主机

1
2
3
4
[root@student01 ~]# cat playbook/inventory/hosts 
[webserver]
student02 ansible_ssh_host=172.25.0.12 ansible_ssh_port=22
student03 ansible_ssh_host=172.25.0.13 ansible_ssh_port=22

上面的playbook/inventory/hosts中,student02是主机172.25.0.12的别名,指定ssh端口为22

4. 群组嵌套

Ansible还支持群组嵌套,定义一个群组包含已经定义的两个群组

1
2
3
4
5
6
7
8
[root@student01 ~]# cat playbook/inventory/hosts 
[webserver02]
student02 ansible_ssh_host=172.25.0.12 ansible_ssh_port=22
[webserver03]
student03 ansible_ssh_host=172.25.0.13 ansible_ssh_port=22
[webserver]
webserver02
webserver03

5. 编号主机

当inventory包含很多主机的时候,可以通过编号范围指定主机

1
2
3
[root@student01 ~]# cat playbook/inventory/hosts 
[webserver]
student[01:20] ansible_ssh_port=22

6. 群组变量

可以在工作目录playbook目录中新建目录host_vars/production来存放变量

1
2
3
[root@student01 ~]# cat playbook/host_vars/production/webserver 
webserver:
name=webserver

nginx/openresty版本升级


nginx和openresty升级方式大同小异,这里以openresty为例介绍openresty平滑升级的详细操作方法

1. 查看nginx当前版本以及编译参数

1
2
3
root@yunwei:~# /usr/local/openresty/nginx/sbin/nginx -V
nginx version: openresty/1.11.2.1
configure arguments: --prefix=/usr/local/openresty/nginx

2. 下载最新版本nginx,解压

1
2
3
root@yunwei:~# wget https://openresty.org/download/openresty-1.11.2.4.tar.gz -O /tmp/openresty-1.11.2.4.tar.gz
root@yunwei:~# cd /tmp/
root@yunwei:/tmp# tar -xzvf openresty-1.11.2.4.tar.gz

3. 编译openresty

1
2
root@yunwei:/tmp# cd openresty-1.11.2.4/
root@yunwei:/tmp/openresty-1.11.2.4# ./configure --prefix=/usr/local/openresty --with-luajit --with-http_iconv_module --with-http_stub_status_module --add-module=../ngx_dynamic_upstream-master -j6 && make -j6

注意:不要进行make install,这里的编译参数要与原环境编译参数一致,否则可能引起服务启动失败

4. 找到编辑的二进制命令,测试能否执行

1
2
3
4
root@yunwei:/tmp/openresty-1.11.2.4# ls build/nginx-1.11.2/objs/nginx
build/nginx-1.11.2/objs/nginx
root@yunwei:/tmp/openresty-1.11.2.4# build/nginx-1.11.2/objs/nginx -v
nginx version: openresty/1.11.2.4

5. 备份旧版本的nginx文件

1
root@openresty-japi-write01:~# cp /usr/local/openresty/nginx/sbin/nginx /tmp/nginx.bak

6. 使用编译生成的二进制nginx文件,升级生产环境版本

1
2
3
root@yunwei:/tmp/openresty-1.11.2.4# install build/nginx-1.11.2/objs/nginx /usr/local/openresty/nginx/sbin/nginx
root@yunwei:/tmp/openresty-1.11.2.4# /usr/local/openresty/nginx/sbin/nginx -v
nginx version: openresty/1.11.2.4

7. 在不影响生产环境的情况下重启nginx服务,才能生效

1
2
3
4
5
6
root@yunwei:~# /usr/local/openresty/nginx/sbin/nginx -s stop
root@yunwei:~# /usr/local/openresty/nginx/sbin/nginx
root@yunwei:~# curl -I 127.0.0.1:8081
HTTP/1.1 404 Not Found
Server: openresty/1.11.2.4
Date: Thu, 13 Jul 2017 08:34:35 GMT

8. 最后检查日志是否正常

1
root@yunwei:~# tailf /data/nginx/log/access.log

一、用户访问网站的完整访问流程

1. 客户端用户在浏览器输入网址www.baidu.com,回车后,系统首先会查找系统本地的DNS缓存及hosts文件信息,确定是否存在www.baidu.com域名对应的ip解析记录,如果有就直接获取ip地址,然后去访问这个ip地址对应的域名www.baidu.com的服务器。

2. 如果客户端本地DNS缓存及hosts文件没有www.baidu.com域名对应的ip解析记录,那么系统会把浏览器的解析请求发送给客户端本地设置的local DNS服务器解析,如果local DNS服务器的本地缓存有对应的解析记录就会返回IP地址给客户端,如果没有,则local DNS会请求其他的DNS服务器

3. local DNS从DNS系统的根开始请求对www.baidu.com域名的解析,并针对各个层级的DNS服务器系统进行一系列的查找,最终会找到www.baidu.com域名对应的授权DNS服务器,而这个授权的DNS服务器是企业购买域名时用于管理域名解析的服务器,这个授权服务器会有www.baidu.com对应的ip解析记录,如果没有解析记录,表示企业没有对www.baidu.com域名做解析设置,即网站没有架设好

4. www.baidu.com域名的授权DNS服务器会把www.baidu.com对应的最终ip解析记录发给local DNS

5. local DNS把来自授权DNS服务器www.baidu.com对应的ip解析记录发给客户端浏览器,并且他会把该域名和ip的对应解析缓存起来,以便下一次更快的返回相同解析请求的记录,这些缓存记录在DNS TTL值控制的时间内不会过期

6. 客户端浏览器获取www.baidu.com的对应ip,浏览器会请求获得ip地址对应的网站服务器,网站服务器接收到客户的请求并响应处理,将客户请求的内容返回给客户端浏览器

二、ginx源码包安装

1. 安装Nginx依赖包

1
[root@student02 ~]# yum -y install gcc gcc-c++ automake pcre pcre-devel zlib zlib-devel open openssl-devel

2. 解压nginx安装包

1
[root@student02 ~]# tar -xzvf nginx-1.13.0.tar.gz

3. 生成Makefile文件

1
2
3
4
[root@student02 ~]# cd nginx-1.13.0/
[root@student02 nginx-1.13.0]# ./configure --user=nginx --group=nginx --prefix=/usr/local/nginx-1.13.0 --with-http_stub_status_module --with-http_ssl_module
[root@student02 nginx-1.13.0]# ll Makefile
-rw-r--r--. 1 root root 376 May 22 10:39 Makefile

4. nginx的编译安装

1
2
3
4
[root@student02 nginx-1.13.0]# make && make install
[root@student02 ~]# useradd nginx -s /sbin/nologin -M
[root@student02 ~]# ln -s /usr/local/nginx-1.13.0 /usr/local/nginx
[root@student02 ~]# ln -s /usr/local/nginx/sbin/nginx /usr/sbin/nginx

5. nginx安装目录介绍

1
2
3
[root@student02 nginx-1.13.0]# cd /usr/local/nginx/
[root@student02 nginx]# ls
conf html logs sbin

conf存放了nginx所有的配置文件,nginx.conf是nginx的主配置文件,html目录存放了nginx服务器在运行过程中调用的网页文件,logs目录存放nginx的日志,sbin目录只有一个nginx文件,是nginx服务器的主程序

6. nginx的版本

1
2
[root@student02 ~]# nginx -v
nginx version: nginx/1.10.2

7. 检查nginx配置文件语法

1
2
3
[root@student02 ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

8. 启动nginx服务

1
2
3
4
5
[root@student02 ~]# nginx
[root@student02 ~]# ps -ef |grep nginx
root 3757 1 0 11:00 ? 00:00:00 nginx: master process nginx
nginx 3758 3757 0 11:00 ? 00:00:00 nginx: worker process
root 3760 1074 0 11:00 pts/0 00:00:00 grep --color=auto nginx

9. 重启nginx服务

1
2
3
[root@student02 ~]# nginx -s stop
[root@student02 ~]# nginx
[root@student02 ~]# nginx -s reload

10. 查看nginx服务端口

1
2
3
4
5
6
7
8
9
[root@student02 ~]# lsof -i :80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 3757 root 8u IPv4 23526 0t0 TCP *:http (LISTEN)
nginx 3757 root 9u IPv6 23527 0t0 TCP *:http (LISTEN)
nginx 3809 nginx 8u IPv4 23526 0t0 TCP *:http (LISTEN)
nginx 3809 nginx 9u IPv6 23527 0t0 TCP *:http (LISTEN)
[root@student02 ~]# netstat -nultp |grep :80' '
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 3757/nginx: master
tcp6 0 0 :::80 :::* LISTEN 3757/nginx: master

11. 检测nginx访问

1
2
[root@student02 ~]# wget 172.25.0.12
[root@student02 ~]# curl 172.25.0.12

三、HTTP协议

1. HTTP的请求方法

在HTTP通信中,每个HTTP请求报文都包含一个请求方法,用于告诉web服务器需要执行哪些具体动作,这些动作包括:获取指定web页面、提交内容到服务器、删除服务器上资源文件等,这些HTTP请求报文的方法称作HTTP请求方法

HTTP请求方法 作用描述
GET 客户端请求指定资源信息,服务器返回指定资源
HEAD 值请求相应报文中的HTTP首部
POST 将客户端的数据提交到服务器
PUT 从客户端上传数据取代指定的文档内容
DELETE 请求服务器删除Request-URI所标识的资源
MOVE 请求服务器将指定的页面移至另一个网络地址

2. HTTP状态码

ELK日志系统安装调优

@(ELK安装配置手册)[运维,基本操作, 安装]

[TOC]

ELK日志系统安装

zookeeper安装

zookeeper是kafka的组件,部署在kafka节点上

  1. 官网下载zookeeper

    1
    wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/stable/apache-zookeeper-3.5.5-bin.tar.gz
  2. 解压文件

    1
    2
    tar -xzf apache-zookeeper-3.5.5-bin.tar.gz -C /opt/
    ln -sf /opt/apache-zookeeper-3.5.5-bin /opt/zookeeper
  3. 修改配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    cat >/opt/zookeeper/conf/zoo.cfg <<EOF
    tickTime=2000
    dataDir=/data/zookeeper/data
    dataLogDir=/data/zookeeper/logs
    clientPort=2181
    admin.serverPort=2182
    initLimit=10
    syncLimit=5
    server.1=172.22.3.61:2888:3888
    server.2=172.22.3.62:2888:3888
    server.3=172.22.3.63:2888:3888
    EOF
  4. 生成唯一myid

    1
    2
    3
    4
    mkdir -p /data/zookeeper/data /data/zookeeper/logs
    echo 1 >/data/zookeeper/data/myid
    echo 2 >/data/zookeeper/data/myid
    echo 3 >/data/zookeeper/data/myid
  5. systemd启动zookeeper

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    cat > /etc/systemd/system/zookeeper.service <<-EOF 
    [Unit]
    Description=zookeeper
    After=network.target remote-fs.target nss-lookup.target

    [Service]
    User=root
    Type=forking
    Environment="JAVA_HOME=/opt/jdk"
    ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF
1
2
systemctl enable zookeeper.service
systemctl start zookeeper.service

kafka安装

kafka推荐三台4C8G集群,100G的SSD磁盘,日志保留12小时。

  1. 官网下载kafka

    1
    wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.3.0/kafka_2.12-2.3.0.tgz
  2. 解压文件

    1
    2
    tar -xzf kafka_2.12-2.3.0.tgz -C /opt/
    ln -s /opt/kafka_2.12-2.3.0 /opt/kafka
  3. 修改配置,broker.id为1,2,3

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    cat > /opt/kafka/config/server.properties <<-EOF
    broker.id=1
    delete.topic.enable=true
    listeners=PLAINTEXT://0.0.0.0:9092
    num.network.threads=3
    num.io.threads=8
    socket.send.buffer.bytes=1024000
    socket.receive.buffer.bytes=1024000
    socket.request.max.bytes=104857600
    log.dirs=/data/kafka/logs
    num.partitions=3
    default.replication.factor=2
    num.recovery.threads.per.data.dir=2
    offsets.topic.replication.factor=2
    transaction.state.log.replication.factor=2
    transaction.state.log.min.isr=2
    log.retention.hours=24
    log.segment.bytes=1073741824
    log.retention.check.interval.ms=300000
    zookeeper.connect=172.22.3.61:2181,172.22.3.62:2181,172.22.3.63:2181
    zookeeper.connection.timeout.ms=6000
    EOF
  4. systemd启动kafka

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    cat > /etc/systemd/system/kafka.service <<-EOF 
    [Unit]
    Description=kafka
    After=network.target remote-fs.target nss-lookup.target

    [Service]
    User=root
    Type=simple
    Environment="JAVA_HOME=/opt/jdk"
    ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF
1
2
3
mkdir -p /data/kafka/data /data/kafka/logs
systemctl enable kafka.service
systemctl start kafka.service

filebeat安装

每台需要收集日志的节点需要安装filebeat,将生产的日志发送给kafka保存起来

  1. 官网下载filebeat

    1
    wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.3.2-linux-x86_64.tar.gz
  2. 解压文件

    1
    2
    tar -xzf filebeat-7.3.2-linux-x86_64.tar.gz -C /opt/
    ln -s /opt/filebeat-7.3.2-linux-x86_64 /opt/filebeat
  3. applog filebeat配置,type需要修改(line8-applog、tech-applog)

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    cat >/opt/filebeat/filebeat.yml <<EOF
    filebeat.inputs:
    - type: log
    paths:
    - /data/logs/*/*.log
    fields:
    type: ${HOSTNAME%%-*}-applog
    multiline:
    pattern: '^\|'
    negate: true
    match: after
    tail_files: true

    output.kafka:
    hosts: ["172.22.3.61:9092", "172.22.3.62:9092", "172.22.3.63:9092"]
    topic: '%{[fields][type]}'
    partition.round_robin:
    reachable_only: false
    compression: gzip
    max_message_bytes: 1000000

    logging.to_files: true
    logging.files:
    path: /var/log/filebeat
    name: filebeat
    rotateeverybytes: 10485760
    keepfiles: 7
    EOF
  4. nginx filebeat配置,type需要修改(nginx-access、pre-nginx-access)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
cat >/opt/filebeat/filebeat.yml <<EOF
filebeat.inputs:
- type: log
paths:
- /data/nginx/logs/access.log
fields:
type: nginx-access
tail_files: true
- type: log
paths:
- /data/nginx/logs/error.log
fields:
type: nginx-error
tail_files: true

output.kafka:
hosts: ["172.22.3.61:9092", "172.22.3.62:9092", "172.22.3.63:9092"]
topic: '%{[fields][type]}'
partition.round_robin:
reachable_only: false
compression: gzip
max_message_bytes: 1000000

logging.to_files: true
logging.files:
path: /var/log/filebeat
name: filebeat
rotateeverybytes: 10485760
keepfiles: 7
EOF
  1. systemd启动服务
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    cat >/etc/systemd/system/filebeat.service <<EOF
    [Unit]
    Description=filebeat
    After=network.target

    [Service]
    User=root
    Type=simple
    ExecStart=/opt/filebeat/filebeat -c /opt/filebeat/filebeat.yml
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF
    systemctl enable filebeat.service
    systemctl start filebeat.service

elasticsearch安装

elasticsearch推荐单台配置为16U64G,JVM为31G,每台可以处理日志1w条/s,1亿日志、50G磁盘空间,计算自己适合的台数。

  1. 官网下载elasticsearch

    1
    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.3.2-linux-x86_64.tar.gz
  2. 解压文件

    1
    2
    tar -xzf elasticsearch-7.4.2-linux-x86_64.tar.gz -C /opt/
    ln -sf /opt/elasticsearch-7.4.2 /opt/elasticsearch
  3. 修改配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    cat >/opt/elasticsearch/config/elasticsearch.yml<<EOF
    cluster.name: sre-elasticsearch-prod
    node.name: sre-elasticsearch-prod01

    bootstrap.memory_lock: true

    path.data: /data/elasticsearch/data
    path.logs: /data/elasticsearch/logs

    network.host: 0.0.0.0
    http.port: 9200

    discovery.zen.ping.unicast.hosts: ["172.22.3.69", "172.22.3.70", "172.22.3.71", "172.22.3.73", "172.22.3.74"]
    discovery.zen.minimum_master_nodes: 3
    cluster.initial_master_nodes: ["172.22.3.69", "172.22.3.70", "172.22.3.71", "172.22.3.73", "172.22.3.74"]

    gateway.recover_after_nodes: 3
    action.destructive_requires_name: false

    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
    EOF
  4. systemd启动elasticsearch

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    cat > /etc/systemd/system/elasticsearch.service <<-EOF 
    [Unit]
    Description=elasticsearch
    After=network.target remote-fs.target nss-lookup.target

    [Service]
    User=akulaku
    Type=simple
    Environment="JAVA_HOME=/opt/jdk"
    ExecStart=/opt/elasticsearch/bin/elasticsearch
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF
  5. 修改jvm参数

    1
    vim /opt/elasticsearch/config/jvm.options
  6. 几个节点修改完配置后普通用户启动服务

    1
    2
    3
    4
    mkdir -p /data/elasticsearch/data /data/elasticsearch/logs /opt/elasticsearch/config/certs
    chown akulaku:akulaku -R /data/elasticsearch/ /opt/elasticsearch-7.4.2/
    systemctl enable elasticsearch.service
    systemctl start elasticsearch.service
  7. 修改limit限制

    1
    2
    3
    4
    5
    vim /etc/systemd/system.conf
    DefaultLimitCORE=infinity
    DefaultLimitNOFILE=655350
    DefaultLimitNPROC=204800
    DefaultLimitMEMLOCK=infinity
  8. 查看服务的状态

    1
    GET _cat/health
  9. 配置index模板

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    {
    "order": 0,
    "version": 60001,
    "index_patterns": [
    "logstash-*"
    ],
    "settings": {
    "index": {
    "number_of_shards": "5",
    "number_of_replicas": "0",
    "refresh_interval": "2s"
    }
    },
    "mappings": {
    "_default_": {
    "_all": {
    "enabled": false
    },
    "dynamic_templates": [
    {
    "message_field": {
    "path_match": "message",
    "match_mapping_type": "string",
    "mapping": {
    "type": "text",
    "norms": false
    }
    }
    },
    {
    "string_fields": {
    "match": "*",
    "match_mapping_type": "string",
    "mapping": {
    "type": "text",
    "norms": false,
    "fields": {
    "keyword": {
    "type": "keyword",
    "ignore_above": 256
    }
    }
    }
    }
    }
    ],
    "properties": {
    "@timestamp": {
    "type": "date"
    },
    "@version": {
    "type": "keyword"
    },
    "geoip": {
    "dynamic": true,
    "properties": {
    "ip": {
    "type": "ip"
    },
    "location": {
    "type": "geo_point"
    },
    "latitude": {
    "type": "half_float"
    },
    "longitude": {
    "type": "half_float"
    }
    }
    },
    "message": {
    "type": "text",
    "analyzer": "ik_max_word",
    "search_analyzer": "ik_max_word"
    }
    }
    }
    },
    "aliases": {}
    }
安装ik分词
  1. 在所有elasticsearch安装ik插件
    1
    /opt/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip
安装cerebro
  1. 下载cerebro管理elasticsearch集群(https://github.com/lmenezes/cerebro/releases)

    1
    wget https://github.com/lmenezes/cerebro/releases/download/v0.8.4/cerebro-0.8.4.tgz

    tar -xzf cerebro-0.8.4.tgz -C /opt/
    ln -s /opt/cerebro-0.8.4 /opt/cerebro

  2. systemd启动cerebro

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    cat >/etc/systemd/system/cerebro.service <<EOF
    [Unit]
    Description=cerebro
    After=network.target

    [Service]
    User=root
    Type=simple
    Environment="JAVA_HOME=/opt/jdk"
    ExecStart=/opt/cerebro/bin/cerebro
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill -s QUIT \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF

logstash安装

> logstash建议单台处理所有topic,JVM为机器内存90%,不会存在资源分配不均衡的问题,logstash需要注意的是pipeline.batch.size和pipeline.batch.delay这两个配置,要多测试,调试出最大的索引速率,elasticsearch索引默认是世界时间,可以添加一段ruby的配置将世界时间转换成北京时间,还有去除一些不需要的字段的操作,可以加快搜索速度,节省存储空间
  1. 官网下载logstash

    1
    wget https://artifacts.elastic.co/downloads/logstash/logstash-7.3.2.tar.gz
  2. 解压文件

    1
    2
    tar -xzf logstash-7.3.2.tar.gz -C /opt/
    ln -s /opt/logstash-7.3.2 /opt/logstash
  3. 修改配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    cat /opt/logstash/config/logstash.yml 
    pipeline.workers: 32
    pipeline.batch.size: 2500
    pipeline.batch.delay: 20
    pipeline.output.workers: 32

    http.host: "0.0.0.0"
    http.port: 9600

    path.config: "/usr/local/logstash/config/yunwei-logstash.conf"

    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.url: "http://172.20.40.20:9200"
    xpack.monitoring.collection.interval: 10s
    xpack.monitoring.collection.pipeline.details.enabled: true
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
cat /opt/logstash/config/sre-logstash-prod.conf 
input{
kafka{
bootstrap_servers => "172.22.3.61:9092,172.22.3.62:9092,172.22.3.63:9092"
group_id => "sre-logstash-prod"
topics => ["nginx-access", "nginx-error", "line8-applog", "tech-applog"]
consumer_threads => 24
codec => json
decorate_events => false
client_id => "sre-logstash-prod01"
}
}

filter{
if [fields][type] == "nginx-access" {
grok{
patterns_dir => ["/opt/logstash/config/pattarns"]
match => ["message","%{NGINXACCESSLOG}"]
}
geoip{
source => "client_ip"
}
mutate{
convert => { "request_time" => "float"}
convert => { "response_time" => "float"}
convert => { "response_status" => "integer"}
convert => { "upstream_status" => "integer"}
}
ruby{
code => "
request = event.get('request')
if request.include?'?'
request_path = request.split('?')[0]
event.set('request_path', request_path)
else
event.set('request_path', request)
end
"
}
date{
match => ["logtime","dd/MMM/yyyy:HH:mm:ss Z"]
}
} else if [fields][type] == "nginx-error" {
grok{
patterns_dir => ["/opt/logstash/config/pattarns"]
match => ["message","%{NGINXERRORLOG}"]
}
} else if [fields][type] in ["line8-applog", "tech-applog"] {
grok{
patterns_dir => ["/opt/logstash/config/pattarns"]
match => ["message","%{APPLOG}"]
}
ruby{
code => "event.set('project_name', event.get('[log][file][path]').split('/')[-1].sub('.log', ''))"
}
}
ruby{
code => "event.set('day', (event.get('@timestamp').time.localtime + 8*60*60).strftime('%Y.%m.%d'))"
}
grok{
overwrite => ["message"]
}
mutate{
add_field => {'topic' => "%{[fields][type]}"}
remove_field => ["@version","agent","input","ecs","tags","fields"]
}
}

output {
elasticsearch {
hosts=> ["172.22.3.69:9200", "172.22.3.70:9200", "172.22.3.71:9200"]
index => "logstash-%{topic}-%{day}"
user => "elastic"
password => "xxx"
}
}
  1. systemd启动服务
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    cat >/etc/systemd/system/logstash.service <<EOF
    [Unit]
    Description=logstash
    After=network.target

    [Service]
    User=akulaku
    Type=simple
    ExecStart=/opt/logstash/bin/logstash
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill -s QUIT \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF

####kibana安装

kibana配置可以很低,前面最好加一个nginx,可以打印出访问日志

  1. 官网下载kibana

    1
    https://artifacts.elastic.co/downloads/kibana/kibana-7.3.2-linux-x86_64.tar.gz
  2. 解压文件

    1
    2
    tar -xzf kibana-7.3.2-linux-x86_64.tar.gz -C /opt/
    ln -s /opt/kibana-7.3.2-linux-x86_64 /opt/kibana
  3. 修改配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    cat >/opt/kibana/config/kibana.yml <<EOF
    server.host: "0.0.0.0"
    server.port: 5601
    server.maxPayloadBytes: 1048576
    server.name: "sre-kibana-prod01"

    elasticsearch.hosts: "http://172.22.3.69:9200"
    elasticsearch.pingTimeout: 1500
    elasticsearch.requestTimeout: 60000

    logging.quiet: true
    ops.interval: 5000
    i18n.locale: 'cn'
    EOF
  4. systemd启动服务

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    cat >/etc/systemd/system/kibana.service <<EOF
    [Unit]
    Description=kibana
    After=network.target

    [Service]
    User=akulaku
    Type=simple
    ExecStart=/opt/kibana/bin/kibana
    ExecReload=/bin/kill -s HUP \$MAINPID
    ExecStop=/bin/kill -s QUIT \$MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
    EOF

openresty代理

kibana和elasticsearch前面加上openresty代理,可以做到请求负载均衡和打印访问日志

  1. openresty配置
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    cat /usr/local/openresty/nginx/conf/vhost/localhost.conf 
    upstream yunwei_elasticsearch {
    zone zone_for_yunwei_elasticsearch 2m;
    server 172.20.40.14:9200 weight=1 max_fails=3 fail_timeout=30s;
    server 172.20.40.15:9200 weight=1 max_fails=3 fail_timeout=30s;
    server 172.20.40.16:9200 weight=1 max_fails=3 fail_timeout=30s;
    server 172.20.40.17:9200 weight=1 max_fails=3 fail_timeout=30s;
    server 172.20.40.18:9200 weight=1 max_fails=3 fail_timeout=30s;
    ip_hash;
    }

    upstream yunwei_kibana {
    zone zone_for_yunwei_kibana 2m;
    server 172.20.40.20:5601 weight=1 max_fails=3 fail_timeout=30s;
    ip_hash;
    }

    server {
    listen 80;
    location / {
    proxy_pass http://yunwei_kibana;
    }
    }

    server {
    listen 9200;
    location / {
    proxy_pass http://yunwei_elasticsearch;
    }
    }

elasticsearch使用

x-park插件sql查询

x-park可以支持直接使用sql语法从elasticsearch查询数据

1
2
POST /_xpack/sql?format=txt
{"query":"select \"@timestamp\",message from \"logstash-architecture-*\" where fields.app='message-sms' and message like '%Notice%' and \"@timestamp\">'2018-08-16T16:00:00.000Z'"}

修改所有索引备份数

建议number_of_shards为机器数量,不重要的服务,如日志服务number_of_replicas可以设置为0,不需要搜索实时性可以设置refresh_interval大一点,减小elasticsearch压力

1
2
3
4
5
6
7
8
PUT _all/_settings
{
"index": {
"number_of_shards": "5",
"number_of_replicas": "0",
"refresh_interval": "10s"
}
}

####关闭字段_all

_all带来搜索方便,其代价是增加了系统在索引阶段对CPU和存储空间资源的开销

1
2
3
4
5
6
7
8
PUT _all/_mappings
{
"_default_": {
"_all": {
"enabled": false
}
}
}

查看所有索引信息

1
GET /_cat/indices

elasticsearch升级

  1. 禁用分片分配

    1
    2
    3
    4
    5
    6
    PUT _cluster/settings
    {
    "persistent": {
    "cluster.routing.allocation.enable": "primaries"
    }
    }
  2. 停止非必要索引并执行同步刷新(可选)

    1
    POST _flush/synced
  3. 停止并升级单个节点

    1
    systemctl stop elasticsearch.service
  4. 启动已升级的节点

    1
    GET /_cat/nodes
  5. 重新启用分片分配

    1
    2
    3
    4
    5
    6
    PUT _cluster/settings
    {
    "persistent": {
    "cluster.routing.allocation.enable": "all"
    }
    }
  6. 等待节点恢复

    1
    2
    GET /_cat/health
    GET /_cat/recovery
  7. 重复操作

elasticsearch清理n天之前的索引

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
import requests
import time

delete_day_ago = 6
elasticsearch_url = 'http://172.22.3.78:9200'
auth_user = ('elastic', '123')

res = requests.get('{}/_cat/indices'.format(elasticsearch_url), auth=auth_user)
index_list = res.text.split()[2::10]

index_expired_day = time.strftime("%Y.%m.%d", time.localtime(time.time() - delete_day_ago * 24 * 60 * 60))
index_expired_timestamp = time.mktime(time.strptime(index_expired_day, '%Y.%m.%d'))

for index in index_list:
if index.startswith('logstash-'):
index_day = index[-10:]
index_timestamp = time.mktime(time.strptime(index_day, '%Y.%m.%d'))
if int(index_timestamp) < index_expired_timestamp:
requests.delete('{}/{}'.format(elasticsearch_url,index), auth=auth_user)
print('DELETE: {}'.format(index))

ik分词使用

1
2
3
4
5
GET _analyze?pretty
{
"analyzer": "ik_max_word",
"text":"安徽省长江流域"
}

kafka的基本命令使用

列出所有可用的topic

1
./bin/kafka-topics.sh --list --zookeeper localhost:2181

新建topic命令(partitions为kafka节点倍数)

1
./bin/kafka-topics.sh -zookeeper localhost:2181 -replication-factor 2 -partitions 12 -create -topic line8-applog

删除topic

1
./bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic line8-applog

彻底删除Kafka中的topic

1
2
3
./bin/zkCli.sh
deleteall /brokers/topics/line8-applog
deleteall /admin/delete_topics/line8-applog

查看kafka数据

1
./bin/kafka-consumer-groups.sh --bootstrap-server 172.22.3.61:9092 --describe --group sre-logstash-prod --topic nginx-access

调整topic分片数

1
./bin/kafka-topics.sh --alter --topic nginx-access --zookeeper localhost:2181 --partitions 9

单个topic修改数据过期时间

1
./bin/kafka-configs.sh --zookeeper localhost:2181  --entity-type topics --alter --entity-name line8-applog --add-config retention.ms=86400000

查看topic配置

1
./bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --describe --entity-name line8-applog

命令行从头开始消费数据

1
./bin/kafka-console-consumer.sh --bootstrap-server 172.22.3.61:9092 --topic line8-applog --from-beginning

修改topic的备份数量

查看当前的topic信息
1
./bin/kafka-topics.sh --zookeeper localhost:2181 --describe  --topic line8-applog
Write the reassignment plan file replication.json
{
  "version": 1,
  "partitions": [
    {
      "topic": "line8-applog",
      "partition": 0,
      "replicas": [1, 2, 3]
    },
    {
      "topic": "line8-applog",
      "partition": 1,
      "replicas": [2, 3, 1]
    },
    {
      "topic": "line8-applog",
      "partition": 2,
      "replicas": [3, 1, 2]
    }
  ]
}

Execute the plan
./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file replication.json --execute
Check progress
./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file replication.json --verify
Produce test data
./bin/kafka-console-producer.sh --broker-list 172.22.3.61:9092 --topic line8-applog
Python producer test
import json
from kafka import KafkaProducer

kafka_obj = KafkaProducer(bootstrap_servers=['172.22.3.61:9092'])

data = {"id": "11010119900300", "account": "1000"}

# serialize as JSON so the payload uses double quotes rather than Python repr
payload = json.dumps(data).encode('utf-8')
print(payload)
res = kafka_obj.send('line8-applog', payload)
print(res)

kafka_obj.flush()
kafka_obj.close()

Containers are a Linux kernel feature

Container isolation

chroot: isolates the root filesystem
cgroup: isolates resources (CPU, memory)
netns: isolates the network
namespace: the general mechanism for isolating kernel resources
ipc: isolates inter-process communication

The namespaces of a running process can be inspected directly; see the one-liner below.
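A quick way to list the namespaces a process belongs to (here, the current shell):

ls -l /proc/$$/ns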

Docker installation prerequisites

A kernel of version 3.0 or later (docker officially requires 3.10+; RHEL 7's kernel qualifies)

[root@master ~]# uname -r
3.10.0-229.el7.x86_64

Lab environment

Docker node       Hostname
control node      master.pod0.example.com
compute node      node.pod0.example.com

Install docker on the master node

[root@foundation0 ~]# ssh root@master.pod0.example.com
[root@master ~]# yum -y install docker

Enable and start the docker service

[root@master ~]# systemctl enable docker
[root@master ~]# systemctl start docker

Docker subcommands cannot be tab-completed out of the box; install the bash completion package (log in again with su - so it takes effect)

[root@master ~]# yum -y install bash-completion
[root@master ~]# su -
[root@master ~]# docker
attach commit create events export history import inspect load logout pause ps push restart rmi save start stop top version
build cp diff exec help images info kill login logs port pull rename rm run search stats tag unpause wait

Check the docker version (docker is written in Go)

[root@master ~]# docker version
Client version: 1.7.1
Client API version: 1.19
Package Version (client): docker-1.7.1-108.el7.x86_64
Go version (client): go1.4.2
Git commit (client): 3043001/1.7.1
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Package Version (server): docker-1.7.1-108.el7.x86_64
Go version (server): go1.4.2
Git commit (server): 3043001/1.7.1
OS/Arch (server): linux/amd64

Containers and images

An image is needed first. All docker images are layered tar archives; layering makes later modification easy. Use the vendor's images for the base layers whenever possible.

Once an image is started, it is a container.
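The layers of a local image can be listed with docker history, e.g. for the rhel7 image pulled below:

[root@master ~]# docker history rhel7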

Set the docker registry address in docker's configuration file

[root@master ~]# vim /etc/sysconfig/docker
ADD_REGISTRY='--add-registry workstation.pod0.example.com:5000'

When certificate-based encryption is not used, add the registry as a trusted (insecure) address

INSECURE_REGISTRY='--insecure-registry workstation.pod0.example.com:5000'

Blacklist the official Docker registry so that images are not searched on docker.io

BLOCK_REGISTRY='--block-registry docker.io'

Restart docker for the configuration changes to take effect

[root@master ~]# systemctl restart docker

Search for images

[root@master ~]# docker search rhel7
INDEX NAME DESCRIPTION STARS OFFICIAL AUTOMATED
example.com workstation.pod0.example.com:5000/library/rhel7 0
example.com workstation.pod0.example.com:5000/openshift3/mysql-55-rhel7 0
example.com workstation.pod0.example.com:5000/openshift3/nodejs-010-rhel7 0
example.com workstation.pod0.example.com:5000/openshift3/php-55-rhel7 0
example.com workstation.pod0.example.com:5000/openshift3/ruby-20-rhel7 0

Pull the rhel7 image

[root@master ~]# docker pull rhel7

List local images

[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
workstation.pod0.example.com:5000/rhel7 latest 275be1d3d070 22 months ago 158.3 MB

Docker's working directory

[root@master ~]# cd /var/lib/docker/
[root@master docker]# ls
containers devicemapper graph init linkgraph.db repositories-devicemapper tmp trust volumes

Runtime files of started containers live in the containers directory

Run the rhel7 image with a bash shell

[root@master ~]# docker run -it rhel7 bash
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or u
se `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
[root@48632bcb1e45 /]# ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var

-i: interactive mode
-t: allocate a pseudo-terminal
-d: run detached in the background

The container looks like an independent system with its own root filesystem.

List running containers

[root@master ~]# docker ps 
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7b34f87319b2 rhel7 "bash" 2 minutes ago Up About a minute adoring_hoover

View the container's process list

[root@master ~]# docker top 7b34f87319b2
UID PID PPID C STIME TTY TIME CMD
root 1044 711 0 09:22 pts/2 00:00:00 bash

Persistent storage requires mounting external storage into the container; see the bind-mount example below.
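A minimal bind-mount sketch (-v host_path:container_path; /opt/data is an assumed host directory):

[root@master ~]# docker run -it -v /opt/data:/data rhel7 bash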

Stop and start a container

[root@master ~]# docker stop 7b34f87319b2
7b34f87319b2
[root@master ~]# docker start 7b34f87319b2
7b34f87319b2

Pull the hello-openshift image, which contains a JBoss service on port 8080

[root@master ~]# docker search hello
INDEX NAME DESCRIPTION STARS OFFICIAL AUTOMATED
example.com workstation.pod0.example.com:5000/openshift/hello-openshift 0
[root@master ~]# docker pull openshift/hello-openshift

Start the hello-openshift image, mapping the container's port 8080 to port 18080 on the host

[root@master ~]# docker run -p 18080:8080 openshift/hello-openshift
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_
on_loop_devices=true` to suppress this warning.
serving on 8080
serving on 8888

List the running container; the port mapping status is shown in the PORTS column

[root@master ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7ed43f0360a5 openshift/hello-openshift "/hello-openshift" 3 minutes ago Up 3 minutes 8888/tcp, 0.0.0.0:18080->8080/tcp clever_sinoussi

View all of the container's information

[root@master ~]# docker inspect 7ed43f0360a5
[
{
"Id": "7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1cc3adb5a4e4",
"Created": "2017-06-12T02:37:32.33856856Z",
"Path": "/hello-openshift",
"Args": [],
"State": {
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 1414,
"ExitCode": 0,
"Error": "",
"StartedAt": "2017-06-12T02:37:33.008916483Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "0f7a086fa28fd211eb84e4b88c3aadca2eabb3dc02eba94bd4e5efaf2ba65ee5",
"NetworkSettings": {
"Bridge": "",
"EndpointID": "dfb34bb05f5e8b49a339e7403d210910ecef129c9962421fda1d48eb0c548c37",
"Gateway": "172.17.42.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"HairpinMode": false,
"IPAddress": "172.17.0.3",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:03",
"NetworkID": "ad45105bbd0e776c6770a2c3587049f033ed0bb7c05c9eacf6fddafa230478e7",
"PortMapping": null,
"Ports": {
"8080/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "18080"
}
],
"8888/tcp": null
},
"SandboxKey": "/var/run/docker/netns/7ed43f0360a5",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null
},
"ResolvConfPath": "/var/lib/docker/containers/7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1cc3adb5a4e4/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1cc3adb5a4e4/hostname",
"HostsPath": "/var/lib/docker/containers/7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1cc3adb5a4e4/hosts",
"LogPath": "/var/lib/docker/containers/7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1cc3adb5a4e4/7ed43f0360a5dd09d2541e12d934569157ddec0754359cac592f1
cc3adb5a4e4-json.log", "Name": "/clever_sinoussi",
"RestartCount": 0,
"Driver": "devicemapper",
"ExecDriver": "native-0.2",
"MountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c882,c888",
"ProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c882,c888",
"Volumes": {},
"VolumesRW": {},
"AppArmorProfile": "",
"ExecIDs": null,
"HostConfig": {
"Binds": null,
"ContainerIDFile": "",
"LxcConf": [],
"Memory": 0,
"MemorySwap": 0,
"CpuShares": 0,
"CpuPeriod": 0,
"CpusetCpus": "",
"CpusetMems": "",
"CpuQuota": 0,
"BlkioWeight": 0,
"OomKillDisable": false,
"Privileged": false,
"PortBindings": {
"8080/tcp": [
{
"HostIp": "",
"HostPort": "18080"
}
]
},
"Links": null,
"PublishAllPorts": false,
"Dns": null,
"DnsSearch": null,
"ExtraHosts": null,
"VolumesFrom": null,
"Devices": [],
"NetworkMode": "bridge",
"IpcMode": "",
"PidMode": "",
"UTSMode": "",
"CapAdd": null,
"CapDrop": null,
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"SecurityOpt": null,
"ReadonlyRootfs": false,
"Ulimits": null,
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"CgroupParent": ""
},
"Config": {
"Hostname": "7ed43f0360a5",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": true,
"AttachStderr": true,
"PortSpecs": null,
"ExposedPorts": {
"8080/tcp": {},
"8888/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": null,
"Cmd": null,
"Image": "openshift/hello-openshift",
"Volumes": null,
"VolumeDriver": "",
"WorkingDir": "",
"Entrypoint": [
"/hello-openshift"
],
"NetworkDisabled": false,
"MacAddress": "",
"OnBuild": null,
"Labels": {},
"Init": ""
}
}
]

Find the container's IP address in the inspect output

[root@master ~]# docker inspect 7ed43f0360a5 |grep -iw ipaddress
"IPAddress": "172.17.0.3",

Access the service inside the container

[root@master ~]# curl http://172.17.0.3:8080
Hello OpenShift!

After docker is installed, a docker0 virtual NIC is created automatically; it acts like a virtual switch

[root@master ~]# ifconfig docker0
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
inet6 fe80::5484:7aff:fefe:9799 prefixlen 64 scopeid 0x20<link>
ether 56:84:7a:fe:97:99 txqueuelen 0 (Ethernet)
RX packets 27 bytes 1810 (1.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 16 bytes 1247 (1.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Accessing port 18080 on the host gives the same result; this is docker port mapping

[root@master ~]# curl http://master.pod0.example.com:18080
Hello OpenShift!

List all containers, including stopped ones

[root@master ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7b34f87319b2 rhel7 "bash" 14 hours ago Exited (137) 12 hours ago adoring_hoover

Remove a container (stop it first)

[root@master ~]# docker stop 16cf4aa41625
16cf4aa41625
[root@master ~]# docker rm 16cf4aa41625
16cf4aa41625

Remove an image

[root@master ~]# docker images
REPOSITORY                                                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
workstation.pod0.example.com:5000/openshift/hello-openshift latest 0f7a086fa28f 22 months ago 5.77 MB
workstation.pod0.example.com:5000/rhel7 latest 275be1d3d070 22 months ago 158.3 MB
[root@master ~]# docker rmi 0f7a086fa28f
Untagged: workstation.pod0.example.com:5000/openshift/hello-openshift:latest
Deleted: 0f7a086fa28fd211eb84e4b88c3aadca2eabb3dc02eba94bd4e5efaf2ba65ee5
Deleted: 0b3f61faa394f34f8444abf70ffc3ffe52fd913bd58cdc2a3de7366b007e7d73
Deleted: 77bb0f21469da7badd05d18664260c33d7e2fc81766715c3aac3f2d0e5e93ad0
Deleted: 66849f7009a5237fb9651fa489555256b15a0df0415d9927fe828345804bbe2c

Dump a downloaded image to a tar archive via standard output

[root@master ~]# docker save 0f7a086fa28f > hello-openshift.tar
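The archive can be imported again on another host with docker load:

[root@master ~]# docker load < hello-openshift.tar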

gpg encryption and decryption

  1. Install the gnupg package
    [root@student01 ~]# yum -y install gnupg
  2. Generate a key pair
    [root@student01 ~]# gpg --gen-key
    Please select what kind of key you want:
    (1) RSA and RSA (default)
    (2) DSA and Elgamal
    (3) DSA (sign only)
    (4) RSA (sign only)
    Your selection?1 ### press Enter for the default algorithm
    RSA keys may be between 1024 and 4096 bits long.
    What keysize do you want? (2048) ### press Enter for the default key size
    Please specify how long the key should be valid.
    0 = key does not expire
    <n> = key expires in n days
    <n>w = key expires in n weeks
    <n>m = key expires in n months
    <n>y = key expires in n years
    Key is valid for? (0) ### press Enter so the key never expires
    Is this correct? (y/N)y ### "y" to confirm
    Real name: welab ### enter the name "welab"
    Email address: ### optional
    Comment: ### optional
    Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O ### "O" to confirm the details are correct
    Passphrase ************* ### enter the private key's protection passphrase "xxx"
    We need to generate a lot of random bytes. It is a good idea to perform
    some other action (type on the keyboard, move the mouse, utilize the
    disks) during the prime generation; this gives the random number
    generator a better chance to gain enough entropy. ### keyboard/mouse/disk activity here helps gather entropy
    gpg: checking the trustdb
    gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
    gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
    pub 2048R/C2673128 2017-05-17
    Key fingerprint = 853E BF25 AAEB 7DD7 42AD F6E0 A13E DDBA C267 3128
    uid welab
    sub 2048R/62345630 2017-05-17 ### key pair generated successfully
  3. View the generated public and private keys
    [root@student01 ~]# gpg -K ### list private (secret) keys
    /root/.gnupg/secring.gpg
    ------------------------
    sec 2048R/C2673128 2017-05-17
    uid welab
    ssb 2048R/62345630 2017-05-17
    [root@student01 ~]# gpg -k ### list public keys
    /root/.gnupg/pubring.gpg
    ------------------------
    pub 2048R/C2673128 2017-05-17
    uid welab
    sub 2048R/62345630 2017-05-17
  4. Export the keys
    [root@student01 ~]# gpg --export -a C2673128 -o C2673128_welab_pub.key ### export the public key
    [root@student01 ~]# gpg --export-secret-keys -a C2673128 -o C2673128_welab_sec.key ### export the private key
  5. Send the public key to the partner
    [root@student01 ~]# scp C2673128_welab_pub.key root@student02:/root/
  6. Import the partner's public key
    [root@student02 ~]# gpg --import C2673128_welab_pub.key
  7. The partner encrypts a file with the public key
    [root@student02 ~]# echo test >test.txt
    [root@student02 ~]# gpg -aer C2673128 test.txt ### encrypt the file with the specified public key
    gpg: 62345630: There is no assurance this key belongs to the named user

    pub 2048R/62345630 2017-05-17 welab
    Primary key fingerprint: 853E BF25 AAEB 7DD7 42AD F6E0 A13E DDBA C267 3128
    Subkey fingerprint: 1705 894F 9842 52CA 9BC8 10BA AC23 FFC1 6234 5630

    It is NOT certain that the key belongs to the person named
    in the user ID. If you *really* know what you are doing,
    you may answer the next question with yes.

    Use this key anyway? (y/N) y ### use the key anyway
    [root@student02 ~]# ls test.txt.asc
    test.txt.asc ### the encrypted file was generated
  8. The partner sends the encrypted file back to us, and we decrypt it with the private key
    [root@student02 ~]# scp test.txt.asc root@student01:/root/
    [root@student01 ~]# gpg --passphrase xxx -o test.txt -d test.txt.asc ### decrypt with the private key, supplying its passphrase
    [root@student01 ~]# cat test.txt
    test ### the file is decrypted
  9. Sometimes both machines need to decrypt files with the private key; send the private key to the second machine and import it
    [root@student01 ~]# scp C2673128_welab_sec.key root@student02:/root/
    [root@student02 ~]# gpg --import C2673128_welab_sec.key ### import the private key