Add 02-ops-dev article

master
Wilfried OLLIVIER 2 months ago
parent
commit
eb3244cf40
1 changed files with 336 additions and 0 deletions
  1. 336
    0
      content/post/02-ops-dev.md

+ 336
- 0
content/post/02-ops-dev.md View File

@@ -0,0 +1,336 @@
1
+---
2
+title: "Dear ops, it's 2019 and you need dev skills"
3
+subtitle: "Evolve, or die in pain"
4
+date: 2019-06-21
5
+draft: false
6
+tags: [ops, dev, devops, opinion]
7
+---
8
+
9
+# To infinity... and beyond!
10
+
11
+    DevOps, IaC, AGILE, SCRUM, Containers, SysOps, CI/CD, microservices, serverless, FAAS
12
+
13
+Yes, it's a buzzword **shitstorm**, but all of these concepts summarize how
14
+_software development_ and _deployment_ are evolving. Do I think that the
15
+sysadmins' time is over? **No**.
16
+
17
+But the time has come: _we_, as sysadmins, have to adapt to this moving
18
+industry.
19
+
20
+Accept it or die, but [cloud
21
+computing](https://en.wikipedia.org/wiki/Cloud_computing) is the norm now.
22
+Small IT companies with "good old LAMP" as the only business model will die
23
+soon.
24
+
25
+Don't be scared, take a deep breath and let's dive into this!
26
+
27
+# DevOps is the new sexy
28
+
29
+## DevOps = "Dev" + "Ops"
30
+
31
+_DevOps_, _SecDevOps_, _DevOps Engineer_ are the most used terms in job
32
+postings. But, after all, what the fuck is that shit?
33
+
34
+In most cases, the people working on an IT project can be split into two teams:
35
+
36
+- [Dev](https://en.wikipedia.org/wiki/Programmer): this team writes
37
+application code. Pushed by marketing teams, clients or users, the main
38
+purpose of the dev team is **adding new features** to the software.
39
+
40
+- [Ops](https://en.wikipedia.org/wiki/System_administrator): this team has
41
+to maintain operational conditions and ensure that **prod is up, up to date
42
+and stable**.
43
+
44
+By nature, these two positions are in conflict. On one side, the ops team wants
45
+stable and predictable stuff; on the other side, the dev team wants the freedom to
46
+move quickly in order to add new features and meet commercial goals and
47
+objectives.
48
+
49
+What could be done to satisfy both worlds? **DevOps**
50
+
51
+## DevOps culture
52
+
53
+First of all, no, DevOps is not a job title _(even if everyone on LinkedIn
54
+thinks so)_. It's a culture, a way of thinking, and the most important point
55
+is this: it tells you how to organize software development teams.
56
+
57
+The purpose is to bring ops and devs to an agreement. After all, everyone works toward the
58
+same goal: **build the best product, work less and make more money**.
59
+
60
+DevOps presents a new way of thinking about how teams collaborate. The main
61
+purpose of DevOps is to open discussions between ops and dev. Devs have to be
62
+aware of how the code is deployed and how the production systems are handled
63
+by the ops team. On the other hand, ops need some dev skills in order to
64
+correctly understand what type of software is served in production.
65
+
66
+To sum it up, DevOps can be presented as a loop of these concepts and
67
+actions:
68
+
69
+- Define a task, or feature, focused on needs from customers or clients
70
+- Implement this feature or execute the defined task
71
+- Code review
72
+- Tests (unit tests, integration tests, staging tests)
73
+- Functional testing on a preproduction environment
74
+- Push to production
75
+- Handle and analyse feedback from production
76
+
77
+In order to achieve all these tasks, teams have to communicate clearly about
78
+what should be done by each team. In some cases, there is an overlap between
79
+dev tasks and ops tasks, but the final goal is the same: get to the best
80
+possible result.
81
+
82
+## DevOps methods and concepts
83
+
84
+Now that the culture is presented, let's take a look at how these principles
85
+can be applied to the real world.
86
+
87
+### Contract
88
+
89
+The most important thing is what I call a **contract**. This is an
90
+agreement that creates a link between dev and ops. A contract needs to be as
91
+descriptive as possible. On one hand, developers need to tell administrators
92
+all of the project details: what needs to be run and which dependencies or
93
+services are required by the application. On the other hand,
94
+operators have to understand dev needs and do all the plumbing to deploy dev
95
+requests to production.
96
+
97
+In most cases, this contract can be represented by
98
+[containers](https://www.docker.com/resources/what-container) and a
99
+[docker-compose](https://docs.docker.com/compose/) file. It's declarative,
100
+easy to read, easy to understand and clear enough to know what needs to be
101
+run and what kind of plumbing is needed to make all services work together
102
+to create the whole application.
103
+
104
+If this is not clear enough, here is a generic example:
105
+
106
+
107
+{{< highlight yaml "linenos=table" >}}
108
+version: '3'
109
+
110
+services:
111
+  web:
112
+    image: galaxies:version
113
+    volumes:
114
+      - ./src:/dest
115
+    depends_on:
116
+      - db
117
+    ports:
118
+      - "8080"
119
+    labels:
120
+      - "frontend.rule=Host:galaxies.rick"
121
+
122
+  db:
123
+    image: postgres
124
+    volumes:
125
+      - ./mounts/db_data:/var/lib/postgresql/data
126
+    environment:
127
+      POSTGRES_USER: RICK
128
+      POSTGRES_PASSWORD: C137
129
+      POSTGRES_DB: galaxies
130
+{{< / highlight >}}
131
+
132
+An ops, receiving this file, can extract a lot of information on how the
133
+application works.
134
+
135
+- A web app that:
136
+    - is accepting requests on port 8080
137
+    - is served under the hostname _galaxies.rick_
138
+    - needs persistent storage
139
+    - needs a postgres database
140
+
141
+Now let's explain what should be done by ops to push this to production:
142
+
143
+0. Ensure that the _galaxies.rick_ DNS record points to the production environment
144
+0. Pull the galaxies:version image
145
+0. Ensure access to a database (can be a container, a cluster or a standalone pg)
146
+0. Inject database variables into production environment
147
+0. Start the galaxies:version image
148
+0. Update HTTP reverse proxy rules to redirect _galaxies.rick_ to _galaxies:version_ on port _8080_
149
+
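The checklist above can be derived mechanically from the contract. Here is a minimal Go sketch of that idea, not a real deployment tool: the `Service` struct, the `Checklist` function and the wording of each step are hypothetical names made up for this illustration, and the contract is written by hand instead of being parsed from the compose file.

```go
package main

import (
	"fmt"
	"strings"
)

// Service is a hypothetical, simplified view of one service of the contract.
type Service struct {
	Name      string
	Image     string
	Port      int      // container port the service listens on, 0 if none
	Host      string   // public hostname taken from the routing label, "" if none
	DependsOn []string // services that must be reachable before starting
	Env       []string // names of the variables ops has to inject
}

// Checklist turns the contract into the ordered steps ops has to perform.
func Checklist(services []Service) []string {
	var steps []string
	for _, s := range services {
		if s.Host != "" {
			steps = append(steps, fmt.Sprintf("ensure DNS for %s points to production", s.Host))
		}
		steps = append(steps, fmt.Sprintf("pull image %s", s.Image))
		for _, dep := range s.DependsOn {
			steps = append(steps, fmt.Sprintf("ensure access to %s", dep))
		}
		if len(s.Env) > 0 {
			steps = append(steps, "inject "+strings.Join(s.Env, ", "))
		}
		steps = append(steps, fmt.Sprintf("start %s", s.Image))
		if s.Host != "" && s.Port != 0 {
			steps = append(steps, fmt.Sprintf("route %s to %s on port %d", s.Host, s.Name, s.Port))
		}
	}
	return steps
}

func main() {
	// Values copied from the compose example; the mapping itself is an assumption.
	web := Service{
		Name: "web", Image: "galaxies:version", Port: 8080, Host: "galaxies.rick",
		DependsOn: []string{"db"},
		Env:       []string{"POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB"},
	}
	for i, step := range Checklist([]Service{web}) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

The point is not the code itself but the fact that a declarative contract is data, and data can be turned into actions by a program instead of a human.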
150
+And **voilà**, now you have a clear line between what kind of stuff
151
+developers will push to production and how the operators will plug the
152
+project into the production environment.
153
+
154
+### Automation
155
+
156
+    Okay, okay! But as an ops, I don't want to take care of all this stuff every time dev needs to push a new version of the software!
157
+
158
+Me neither, and this is why the second most important thing is **automation**!
159
+
160
+Take a look at all the tasks described earlier. Do you really want to make all
161
+those changes by hand using vim on the production environment?
162
+
163
+Ten or even twenty years ago, the first automated things were machine bootstrap
164
+and basic configuration using scripts. Modern application requirements mean
165
+more machines and more complexity. The easiest way to handle this new level is
166
+to delegate some of the tasks to computers using **declarative** structures.
167
+
168
+This is why tools like _Ansible_ are now popular and widely used. Today we
169
+want to describe a state and let tools do the stuff needed to get to this
170
+state. Why? Because this is the simplest way to normalize how things have
171
+to be done[^1] and to get complex systems up and running. If there is a bug
172
+or a missing feature in one of these tools, there is a good chance that you
173
+will have to get your hands dirty and code it.
174
+
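"Describe a state and let tools do the stuff" boils down to comparing a desired state with the actual one and computing the difference. The following Go sketch shows that loop on a toy example. It is an assumption about how such a tool can be structured, not how Ansible is implemented: `Plan`, the package names and the versions are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Plan compares the desired state with the actual one and returns the
// actions needed to converge. Both maps go from package name to version.
func Plan(desired, actual map[string]string) []string {
	var actions []string
	for name, want := range desired {
		got, ok := actual[name]
		switch {
		case !ok:
			actions = append(actions, "install "+name+"="+want)
		case got != want:
			actions = append(actions, "change "+name+" "+got+" -> "+want)
		}
	}
	// Deliberately strict choice for this sketch: anything that is not
	// described gets removed.
	for name := range actual {
		if _, ok := desired[name]; !ok {
			actions = append(actions, "remove "+name)
		}
	}
	sort.Strings(actions) // map iteration order is random, keep the output stable
	return actions
}

func main() {
	desired := map[string]string{"nginx": "1.16", "postgresql": "11"}
	actual := map[string]string{"nginx": "1.14", "apache2": "2.4"}
	for _, action := range Plan(desired, actual) {
		fmt.Println(action)
	}
	// Once the machine matches the description there is nothing left to do.
	fmt.Println("actions when converged:", len(Plan(desired, desired)))
}
```

Running the plan twice against a converged machine yields no action at all, and this idempotence is what makes declarative tooling predictable.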
175
+This first step of automation gives ops the opportunity to work on tools
176
+associated with concepts like [Continuous
177
+Integration](https://en.wikipedia.org/wiki/Continuous_integration) and
178
+[Continuous Delivery](https://en.wikipedia.org/wiki/Continuous_delivery).
179
+
180
+Do not forget that current apps do more than serve a simple website.
181
+By using small iterations, devs gain the ability to make small changes often
182
+instead of doing major updates. This way of delivering software enhances
183
+stability because most of the code base is not changed between deployments
184
+to production. If something goes wrong, it will be a lot easier to bisect to
185
+the root cause of the bug or unexpected app response.
186
+
187
+To give devs the power to deliver atomic changes to pre-production, staging
188
+and production, they need to deploy all the stuff by themselves in a
189
+**predictable and reproducible** way, and the answer is automation!
190
+
191
+### Metrics & alerting
192
+
193
+```
194
+With great power comes great responsibility !
195
+```
196
+
197
+This is why metrics matter. If devs can deploy stuff to prod, they also need
198
+to know if everything is working as expected and no, I will not give them
199
+root access to production!
200
+
201
+Enabling low-level metrics assures ops that production is up and running
202
+smoothly, but this is not enough to apply DevOps principles. Devs also need
203
+**visibility** on how the app is handling requests (success / error rates) or
204
+on the status of the queue system. Every team needs different sets of
205
+metrics, specific to their missions.
206
+
207
+With metrics comes **alerting**. If metrics are well defined, alerts can
208
+be routed to people who know what can be done to resolve the problem.
209
+For example, if the app goes crazy, a member of the dev team will be in the
210
+right position to take action and fix the problem, but if it's a proxy memory
211
+leak, an ops will probably know where to dig to find and resolve
212
+the issue.
213
+
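Routing alerts to the right people can be as simple as a lookup table that says which team owns which component. The Go sketch below illustrates this under stated assumptions: the `Alert` type, the `owners` table, the fallback to ops and the 1% threshold are all hypothetical, and a real monitoring stack would bring its own routing configuration.

```go
package main

import "fmt"

// Alert is a hypothetical alert raised by the monitoring stack.
type Alert struct {
	Source string // component that raised the alert, e.g. "app" or "proxy"
	Metric string
	Value  float64
}

// ErrorRate returns the share of failed requests, or 0 when there is no traffic.
func ErrorRate(success, failure int) float64 {
	total := success + failure
	if total == 0 {
		return 0
	}
	return float64(failure) / float64(total)
}

// owners maps a component to the team that knows how to fix it.
var owners = map[string]string{
	"app":   "dev",
	"queue": "dev",
	"proxy": "ops",
	"host":  "ops",
}

// Route picks the team to notify; unknown sources go to ops by default.
func Route(a Alert) string {
	if team, ok := owners[a.Source]; ok {
		return team
	}
	return "ops"
}

func main() {
	rate := ErrorRate(950, 50)
	if rate > 0.01 { // hypothetical threshold: alert above 1% of errors
		a := Alert{Source: "app", Metric: "error_rate", Value: rate}
		fmt.Printf("%s=%.2f -> notify %s\n", a.Metric, a.Value, Route(a))
	}
	leak := Alert{Source: "proxy", Metric: "memory_bytes", Value: 8e9}
	fmt.Printf("%s -> notify %s\n", leak.Metric, Route(leak))
}
```

The table is the interesting part: writing it forces both teams to agree on who owns what, which is exactly the kind of discussion DevOps is supposed to open.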
214
+Performance monitoring is as mandatory as alerts. Teams want to be sure that
215
+the newly pushed feature is working and that there is no performance drop
216
+somewhere else in the application.
217
+
218
+### Culture, sharing and empathy
219
+
220
+    Sharing is caring
221
+
222
+Sharing is one of the keys to applying DevOps concepts and culture. In this
223
+situation, sharing is not restricted to **communication**; it also covers
224
+**responsibilities**.
225
+
226
+If dev teams **share responsibilities** with the ops team, there is a whole new
227
+field of possibilities to collaborate and simplify deployment and
228
+maintenance. For the ops team, better communication and shared responsibilities
229
+with dev ensure that they have access to information regarding business
230
+goals and production requirements.
231
+
232
+If devs and ops are both responsible for the failure, or success, of the product,
233
+there is less risk of falling into a blame and counter-blame situation. With
234
+more communication and a strong chain of trust between members of the ops and dev
235
+teams, everyone gets autonomy and a voice in the dev or deployment process.
236
+
237
+Last but not least, a lot of **empathy** is needed. Failure is a success!
238
+It's one of the best ways to learn new things, but this is only possible in a
239
+safe and tolerant environment. The war between dev and ops is over. Listening
240
+and talking to everyone will probably help every member of the teams.
241
+
242
+## Limits
243
+
244
+Of course, DevOps is not perfect. When misunderstood, it can be quite
245
+catastrophic. Reducing DevOps to a buzzword is as silly as taking everything
246
+in the holy Bible as Truth. DevOps is a mentality; you will have to create
247
+your own DevOps culture.
248
+
249
+A lot of managers and leaders think that DevOps means: fire half of the
250
+devs, half of the admins, mix up what's left, and voilà. Is this sharing? No! Doing
251
+DevOps means more distributed responsibility and less stress for everyone.
252
+
253
+Another wrong idea is to think that more deployments means more features.
254
+Integrating deployment automation and testing is only meant to enhance the
255
+robustness of the whole system. The idea is not to ship more features, but more
256
+small changes in order to increase production stability.
257
+
258
+# A new open field
259
+
260
+After all that, there is clearly an emerging need for programming skills
261
+common to all the principles and methods presented.
262
+
263
+## Why does the industry need (smart)ops?
264
+
265
+A smartops is someone who clearly understands that the IT industry is
266
+changing. Everything is moving to the **cloud**, more and more services are
267
+externalized and everything becomes more and more automated. All this stuff
268
+creates a violent shift between two sets of methods:
269
+
270
+- an old one
271
+    - launch commands in a terminal using ssh
272
+    - bash scripts to set things up
273
+    - edit files directly on production using _vim_
274
+
275
+- a new one
276
+    - pipelines
277
+    - automation
278
+    - interactions between HTTP services
279
+
280
+No, I'm not saying that ssh is dead. I'm saying that methods evolve.
281
+
282
+As more automation means less human action, there is clearly a move to
283
+descriptive infrastructure deployments and internal services doing all the
284
+plumbing stuff needed to get a stable and viable production.
285
+
286
+In order to meet all these new challenges, the industry needs to delegate tasks
287
+to smart programs written in code. These new services and automation programs
288
+have to be **written by ops**, because they are the ones who truly know
289
+how to run production systems at scale. But, sorry to say, perl and bash
290
+scripts can't do that kind of job. More automation of everything also means
291
+automation of the most complex tasks in the stack and this is where scripting
292
+languages are not enough.
293
+
294
+For those who think I'm wrong: maybe. But here is my opinion, based on a
295
+lot of bash and perl script experiments. When things need at least
296
+parallelism, HTTP request and response manipulation, strong error handling
297
+or the ability to push stuff into a monitoring stack, golang will be my choice
298
+and I deeply think it should be yours too, because this is the main purpose
299
+of this new kind of language, created specifically to answer dev and ops
300
+problems.
301
+
302
+Moving from scripting to programming will also help smartops understand how
303
+the software they put in production works. By knowing how to build
304
+software, ops will gain the ability to help devs, integrating everyone in a
305
+DevOps culture.
306
+
307
+## New profiles, new horizons
308
+
309
+Yes, ops **need dev skills** in order to get a role in teams solving the new
310
+challenges that come with modern infrastructures and cloud infrastructures.
311
+
312
+This changing ecosystem also gives ops and dev room to evolve. With some
313
+effort, everyone can, at least, take a look at how roles and interconnections
314
+between ops and dev work. To be clear, I'm saying that devs also need ops
315
+skills! But I'll keep that for another article, stay tuned.
316
+
317
+If old ops don't want to make the effort, that's not a problem because a new kind of
318
+people gets what is happening. Believe it or not, it's real. The smartops
319
+community is **inclusive**. Even if this is not perfect yet, the _golang_ and
320
+_k8s_ communities are clearly LGBT and women friendly!
321
+
322
+I want to thank all the _LGBT_||women gophers||ops[^2] I follow because they
323
+are the roots of this wonderful and refreshing community[^3]. The best thing
324
+I can do is to invite you to follow these people! Here is the list:
325
+
326
+- [Jessie Frazelle](https://twitter.com/jessfraz)
327
+- [Ashley McNamara](https://twitter.com/ashleymcnamara)
328
+- [Ellen Korbes](https://twitter.com/ellenkorbes)
329
+- [Kris Nova](https://twitter.com/krisnova)
330
+- [Jaana B. Dogan](https://twitter.com/rakyll)
331
+- [Francesc Campoy](https://twitter.com/francesc)
332
+- [Aditya Mukerjee](https://twitter.com/chimeracoder)
333
+
334
+[^1]: Alice likes apt-get, Bob likes aptitude? I don't care, I just want a standardized way to install a package.
335
+[^2]: This is an inclusive OR
336
+[^3]: If you want to be removed or added to the list, just send me a tweet or whatever.
