Flink in practice: order payment and reconciliation monitoring (implemented with CEP and ProcessFunction respectively)

Time:2021-5-3

On an e-commerce site, order payment is directly tied to money, so it is a critical step in the business flow. To keep that flow under control and to encourage users to pay, a site usually sets a payment expiry time: an order that is not paid within that period is cancelled. We also need to make sure the final payment is correct, which we can do by reconciling in real time against the transaction data of the third-party payment platform.

The first goal is to read the order data in real time, analyze each order's payment status, and report in real time which orders were paid successfully and which timed out after 15 minutes without payment.

Create a new Maven project. These are the basic dependencies; if you have already added them, you can skip this step.

<properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <flink.version>1.10.1</flink.version>
    <scala.binary.version>2.12</scala.binary.version>
    <kafka.version>2.2.0</kafka.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-scala_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_${scala.binary.version}</artifactId>
        <version>${kafka.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>cn.hutool</groupId>
        <artifactId>hutool-all</artifactId>
        <version>5.5.6</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table-planner-blink_2.12</artifactId>
        <version>1.10.1</version>
    </dependency>
</dependencies>

This scenario uses CEP, so add the CEP dependency:

<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-cep-scala_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
</dependencies>

Prepare the data source file src/main/resources/OrderLog.csv (columns: order ID, event type, transaction ID, timestamp in seconds):

1234,create,,1611047605
1235,create,,1611047606
1236,create,,1611047606
1234,pay,akdb3833,1611047616
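As a quick check of the file format, here is a minimal sketch of parsing one such line in plain Scala, without Flink. The object name `OrderLogParsing` and the helper `parseOrderEvent` are hypothetical and only serve the illustration; the case class mirrors the `OrderEvent` used by the jobs in this article.

```scala
object OrderLogParsing {
  // Same shape as the OrderEvent case class used by the Flink jobs in this article.
  case class OrderEvent(orderId: Long, eventType: String, txId: String, ts: Long)

  // Hypothetical helper: parse one CSV line of OrderLog.csv.
  // The limit -1 tells split to keep empty fields, such as the empty txId of a create event.
  def parseOrderEvent(line: String): OrderEvent = {
    val arr = line.split(",", -1).map(_.trim)
    require(arr.length == 4, s"expected 4 fields, got ${arr.length}: $line")
    OrderEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong)
  }

  def main(args: Array[String]): Unit = {
    val lines = Seq("1234,create,,1611047605", "1234,pay,akdb3833,1611047616")
    lines.map(parseOrderEvent).foreach(println)
  }
}
```

The create events carry an empty transaction ID, so the third field parses to an empty string rather than being dropped.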

Rename the java source directory to scala and create a new object com.mafei.orderPayMonitor.OrderTimeoutMonitor (OrderTimeoutMonitor.scala):

/*
 *
 * @author mafei
 * @date 2021/1/31
 */
package com.mafei.orderPayMonitor

import org.apache.flink.cep.{PatternSelectFunction, PatternTimeoutFunction}
import org.apache.flink.cep.scala.CEP
import org.apache.flink.cep.scala.pattern.Pattern
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala.{OutputTag, StreamExecutionEnvironment, createTypeInformation}
import org.apache.flink.streaming.api.windowing.time.Time
import java.util

/**
 * Input case class
 *
 * @param orderId   Order ID
 * @param eventType Event type: create order or pay order
 * @param txId      Payment transaction ID
 * @param ts        Timestamp (seconds)
 */
case class OrderEvent(orderId: Long, eventType: String, txId: String, ts: Long)

/**
 * Output case class
 */
case class OrderResult(orderId: Long, resultMsg: String)

object OrderTimeoutMonitor {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // 1. Read data from the file
    val resource = getClass.getResource("/OrderLog.csv")
    val orderEventStream = env.readTextFile(resource.getPath)
      .map(d => {
        val arr = d.split(",")
        OrderEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .keyBy(_.orderId) // group by order ID

    // 2. Define the event pattern: an order is created and then paid within 15 minutes
    val orderPayPattern = Pattern
      .begin[OrderEvent]("create").where(_.eventType == "create") // an order creation event comes first
      .followedBy("pay").where(_.eventType == "pay") // a payment event follows later
      .within(Time.minutes(15)) // both events must occur within 15 minutes

    // 3. Apply the pattern to the stream
    val patternStream = CEP.pattern(orderEventStream, orderPayPattern)

    // 4. Define a side output tag for the timeout events
    val orderTimeoutTag = new OutputTag[OrderResult]("orderTimeout")

    // 5. Call select to extract and process both the matched events and the timed-out events
    val resultStream = patternStream.select(
      orderTimeoutTag,
      new OrderTimeoutSelect(),
      new OrderPaySelect()
    )

    resultStream.print("pay")
    resultStream.getSideOutput(orderTimeoutTag).print()
    env.execute("order timeout monitor")
  }
}

// Handles partial matches that timed out: the create event was seen, but no pay event
// arrived within the window, which means the order payment has timed out.
class OrderTimeoutSelect() extends PatternTimeoutFunction[OrderEvent, OrderResult] {
  override def timeout(map: util.Map[String, util.List[OrderEvent]], l: Long): OrderResult = {
    val timeoutOrderId = map.get("create").iterator().next().orderId
    OrderResult(timeoutOrderId, "timeout, timeout timestamp: " + l)
  }
}

// Handles complete matches: the order was created and paid within the window.
class OrderPaySelect() extends PatternSelectFunction[OrderEvent, OrderResult] {
  override def select(map: util.Map[String, util.List[OrderEvent]]): OrderResult = {
    val orderTs = map.get("create").iterator().next().ts
    val paydTs = map.get("pay").iterator().next().ts
    val payedOrderId = map.get("pay").iterator().next().orderId
    OrderResult(payedOrderId, "order payment succeeded, order time: " + orderTs + " payment time: " + paydTs)
  }
}
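To make the expected result for the sample data concrete, the following is a rough plain-Scala sketch of what the pattern decides per order. It does not use Flink or CEP; `OrderTimeoutSketch` and `classify` are hypothetical names, and treating the 15-minute bound as exclusive is an assumption of this sketch.

```scala
object OrderTimeoutSketch {
  case class OrderEvent(orderId: Long, eventType: String, txId: String, ts: Long)

  // 15 minutes, in seconds (the unit used by the ts column)
  val timeoutSeconds: Long = 15 * 60L

  // For every order that has a create event: "paid" if a pay event follows the
  // earliest create event within the window, otherwise "timeout".
  def classify(events: Seq[OrderEvent]): Map[Long, String] =
    events.groupBy(_.orderId).toList.collect {
      case (id, evs) if evs.exists(_.eventType == "create") =>
        val create = evs.filter(_.eventType == "create").minBy(_.ts)
        val paid = evs.exists(e =>
          e.eventType == "pay" && e.ts >= create.ts && e.ts - create.ts < timeoutSeconds)
        id -> (if (paid) "paid" else "timeout")
    }.toMap

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      OrderEvent(1234L, "create", "", 1611047605L),
      OrderEvent(1235L, "create", "", 1611047606L),
      OrderEvent(1236L, "create", "", 1611047606L),
      OrderEvent(1234L, "pay", "akdb3833", 1611047616L)
    )
    classify(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

With the four sample lines, order 1234 is paid 11 seconds after creation, while orders 1235 and 1236 never receive a pay event and end up as timeouts.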

Use ProcessFunction to implement the same scenario
Using the same CSV data as above, create a new Scala object src/main/scala/com/mafei/orderPayMonitor/OrderTimeoutMonitorWithProcessFunction.scala

/*
 *
 * @author mafei
 * @date 2021/1/31
*/
package com.mafei.orderPayMonitor
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.streaming.api.scala.{OutputTag, StreamExecutionEnvironment, createTypeInformation}
import org.apache.flink.util.Collector
object OrderTimeoutMonitorWithProcessFunction {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // 1. Read data from the file
    val resource = getClass.getResource("/OrderLog.csv")
    val orderEventStream = env.readTextFile(resource.getPath)
      .map(d => {
        val arr = d.split(",")
        OrderEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .keyBy(_.orderId) // group by order ID

    val resultStream = orderEventStream
      .process(new OrderPayMatchProcess())

    resultStream.print("successful payment")
    // The tag id must match the one used inside OrderPayMatchProcess ("timeout")
    resultStream.getSideOutput(new OutputTag[OrderResult]("timeout")).print("order timeout event")
    env.execute("order payment monitoring with processfunction")
  }
}
class OrderPayMatchProcess() extends KeyedProcessFunction[Long, OrderEvent, OrderResult] {
  // State flags recording whether create and pay have been seen, plus the registered timer timestamp
  lazy val isCreateOrderState: ValueState[Boolean] = getRuntimeContext.getState(new ValueStateDescriptor[Boolean]("isCreateOrderState", classOf[Boolean]))
  lazy val isPayedOrderState: ValueState[Boolean] = getRuntimeContext.getState(new ValueStateDescriptor[Boolean]("isPayedOrderState", classOf[Boolean]))
  lazy val timerTsState: ValueState[Long] = getRuntimeContext.getState(new ValueStateDescriptor[Long]("timerTsState", classOf[Long]))
  // Side output tag that captures the orders that timed out
  val orderTimeoutOutputTag = new OutputTag[OrderResult]("timeout")
  override def onTimer(timestamp: Long, ctx: KeyedProcessFunction[Long, OrderEvent, OrderResult]#OnTimerContext, out: Collector[OrderResult]): Unit = {
    // When the timer fires, create and pay cannot both be present,
    // because that case is already handled in processElement
    if (isCreateOrderState.value()) {
      // Only the create event was seen
      ctx.output(orderTimeoutOutputTag, OrderResult(ctx.getCurrentKey, "order not paid or overtime"))
    } else if (isPayedOrderState.value()) {
      // Only the pay event was seen
      ctx.output(orderTimeoutOutputTag, OrderResult(ctx.getCurrentKey, "only payment, no order submission"))
    }
    isCreateOrderState.clear()
    isPayedOrderState.clear()
    timerTsState.clear()
  }
  override def processElement(i: OrderEvent, context: KeyedProcessFunction[Long, OrderEvent, OrderResult]#Context, collector: Collector[OrderResult]): Unit = {
    /**
     * Check whether the current event is create or pay. There are two situations:
     * 1. Both create and pay have arrived:
     *    if there is no timeout, emit to the main output; otherwise emit to the side output.
     * 2. One of create or pay has not arrived:
     *    register a timer and emit only when the timer fires.
     */
    val isCreate = isCreateOrderState.value()
    val isPayed = isPayedOrderState.value()
    val timerTs = timerTsState.value()

    // 1. A create event arrives
    if (i.eventType == "create") {
      // 1.1 The payment has already arrived: normal payment, emit the match
      if (isPayed) {
        isCreateOrderState.clear()
        isPayedOrderState.clear()
        timerTsState.clear()
        context.timerService().deleteEventTimeTimer(timerTs)
        collector.collect(OrderResult(context.getCurrentKey, "payment succeeded"))
      } else {
        // 1.2 No payment yet: register a timer that fires 15 minutes after the create event.
        // ts is in seconds, timers are in milliseconds, so register the same value that is stored.
        val ts = i.ts * 1000L + 900 * 1000L
        context.timerService().registerEventTimeTimer(ts)
        timerTsState.update(ts)
        isCreateOrderState.update(true)
      }
    }
    // 2. A pay event arrives
    else if (i.eventType == "pay") {
      if (isCreate) { // the create event has already been seen
        if (i.ts * 1000L < timerTs) { // paid before the timeout: normal payment
          collector.collect(OrderResult(context.getCurrentKey, "payment succeeded"))
        } else {
          context.output(orderTimeoutOutputTag, OrderResult(context.getCurrentKey, "paid, but after the timeout"))
        }
        isCreateOrderState.clear()
        isPayedOrderState.clear()
        timerTsState.clear()
        context.timerService().deleteEventTimeTimer(timerTs)
      } else {
        // The create event has not been seen yet: register a timer at the pay time and wait
        val ts = i.ts * 1000L
        context.timerService().registerEventTimeTimer(ts)
        isPayedOrderState.update(true)
        timerTsState.update(ts)
      }
    }
  }
}
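The timer logic depends on one piece of arithmetic: the CSV timestamps are in seconds, while Flink event-time timers are in milliseconds. The sketch below works through that conversion for the sample data in plain Scala; `TimerArithmetic`, `deadlineMs` and `paidInTime` are hypothetical names used only for illustration.

```scala
object TimerArithmetic {
  // 15 minutes expressed in milliseconds: 900 * 1000 = 900000
  val timeoutMs: Long = 900 * 1000L

  // Timestamp (ms) at which the timeout timer for an order should fire.
  def deadlineMs(createTsSeconds: Long): Long = createTsSeconds * 1000L + timeoutMs

  // A payment counts as in time if it happens strictly before the deadline,
  // mirroring the check i.ts * 1000L < timerTs in processElement.
  def paidInTime(createTsSeconds: Long, payTsSeconds: Long): Boolean =
    payTsSeconds * 1000L < deadlineMs(createTsSeconds)

  def main(args: Array[String]): Unit = {
    println(deadlineMs(1611047605L))              // deadline for order 1234
    println(paidInTime(1611047605L, 1611047616L)) // pay event of order 1234
  }
}
```

For order 1234 the deadline is 1611047605000 + 900000 = 1611048505000 ms, and the pay event at 1611047616000 ms is well before it, so the order counts as paid.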

The code above monitors whether users pay. In practice we also need to reconcile each payment in real time against the receipt from the third-party payment platform.

This means combining two data streams: the payment events and the receipts.

The receipts are simulated here, so prepare a data file ReceiptLog.csv next to OrderLog.csv (columns: transaction ID, payment channel, timestamp in seconds):

akdb3833,alipay,1611047619
akdb3832,wechat,1611049617

The code: src/main/scala/com/mafei/orderPayMonitor/TxMatch.scala

/*
 *
 * @author mafei
 * @date 2021/1/31
*/
package com.mafei.orderPayMonitor
import com.mafei.orderPayMonitor.OrderTimeoutMonitor.getClass
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.co.CoProcessFunction
import org.apache.flink.streaming.api.scala.{OutputTag, StreamExecutionEnvironment, createTypeInformation}
import org.apache.flink.util.Collector
case class ReceiptEvent(orderId: String, payChannel:String, ts: Long)
object TxMatch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // 1. Read the data from the order file
    val resource = getClass.getResource("/OrderLog.csv")
    val orderEventStream = env.readTextFile(resource.getPath)
      .map(d => {
        val arr = d.split(",")
        OrderEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .filter(_.eventType == "pay")
      .keyBy(_.txId) // group by transaction ID

    // 2. Read the data from the receipt file
    val receiptResource = getClass.getResource("/ReceiptLog.csv")
    val receiptEventStream = env.readTextFile(receiptResource.getPath)
      .map(d => {
        val arr = d.split(",")
        ReceiptEvent(arr(0), arr(1), arr(2).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .keyBy(_.orderId) // the orderId field of a receipt holds the transaction ID

    // 3. Connect the two streams and process them together
    val resultStream = orderEventStream.connect(receiptEventStream)
      .process(new TxPayMatchResult())

    resultStream.print("match: ")
    // The tag ids must match the ones used inside TxPayMatchResult
    resultStream.getSideOutput(new OutputTag[OrderEvent]("unmatched-pay")).print("unmatched-pay")
    resultStream.getSideOutput(new OutputTag[ReceiptEvent]("receipt")).print("unmatched-receipt")
    env.execute()
  }
}
class TxPayMatchResult() extends CoProcessFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)] {
  // State holding the pay event or receipt that is still waiting for its counterpart
  // (the descriptor name "orderEvent" is assumed; it is cut off in the original listing)
  lazy val orderEventState: ValueState[OrderEvent] = getRuntimeContext.getState(new ValueStateDescriptor[OrderEvent]("orderEvent", classOf[OrderEvent]))
  lazy val receiptEventState: ValueState[ReceiptEvent] = getRuntimeContext.getState(new ValueStateDescriptor[ReceiptEvent]("payEvent", classOf[ReceiptEvent]))
  // Side output tags for the unmatched events
  val unmatchedOrderEventTag = new OutputTag[OrderEvent]("unmatched-pay")
  val unmatchedReceiptEventTag = new OutputTag[ReceiptEvent]("receipt")
  override def processElement1(in1: OrderEvent, context: CoProcessFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)]#Context, collector: Collector[(OrderEvent, ReceiptEvent)]): Unit = {
    // A payment event has arrived
    val receiptEvent = receiptEventState.value()
    if (receiptEvent != null) {
      // The receipt has already arrived: emit the match directly
      collector.collect((in1, receiptEvent))
      orderEventState.clear()
      receiptEventState.clear()
    } else {
      // The receipt has not arrived yet: register a timer and wait for 10 seconds
      context.timerService().registerEventTimeTimer(in1.ts * 1000L + 10000L)
      orderEventState.update(in1)
    }
  }
  override def processElement2(in2: ReceiptEvent, context: CoProcessFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)]#Context, collector: Collector[(OrderEvent, ReceiptEvent)]): Unit = {
    // A receipt has arrived
    val orderEvent = orderEventState.value()
    if (orderEvent != null) {
      // The payment event has already arrived: emit the match directly
      collector.collect((orderEvent, in2))
      orderEventState.clear()
      receiptEventState.clear()
    } else {
      // The payment event has not arrived yet: register a timer and wait for 2 seconds
      context.timerService().registerEventTimeTimer(in2.ts * 1000L + 2000L)
      receiptEventState.update(in2)
    }
  }
  override def onTimer(timestamp: Long, ctx: CoProcessFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)]#OnTimerContext, out: Collector[(OrderEvent, ReceiptEvent)]): Unit = {
    // Whatever is still in state when the timer fires never found its counterpart
    if (orderEventState.value() != null) {
      ctx.output(unmatchedOrderEventTag, orderEventState.value())
    } else if (receiptEventState.value() != null) {
      ctx.output(unmatchedReceiptEventTag, receiptEventState.value())
    }
    orderEventState.clear()
    receiptEventState.clear()
  }
}
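As a rough illustration of the outcome of this reconciliation on the sample files, here is a plain-Scala sketch that matches pay events and receipts by transaction ID. It ignores event time and timers entirely and assumes each transaction ID appears at most once per stream; `TxMatchSketch`, `Reconciliation` and `reconcile` are hypothetical names, and the receipt field is called `txId` here for readability.

```scala
object TxMatchSketch {
  case class OrderEvent(orderId: Long, eventType: String, txId: String, ts: Long)
  case class ReceiptEvent(txId: String, payChannel: String, ts: Long)

  // Result of the reconciliation: matched pairs plus whatever found no counterpart.
  case class Reconciliation(
    matched: Seq[(OrderEvent, ReceiptEvent)],
    unmatchedPays: Seq[OrderEvent],
    unmatchedReceipts: Seq[ReceiptEvent])

  def reconcile(pays: Seq[OrderEvent], receipts: Seq[ReceiptEvent]): Reconciliation = {
    val receiptsByTx = receipts.map(r => r.txId -> r).toMap
    val payTxIds = pays.map(_.txId).toSet
    Reconciliation(
      matched = pays.flatMap(p => receiptsByTx.get(p.txId).map(r => (p, r))),
      unmatchedPays = pays.filterNot(p => receiptsByTx.contains(p.txId)),
      unmatchedReceipts = receipts.filterNot(r => payTxIds.contains(r.txId)))
  }

  def main(args: Array[String]): Unit = {
    val pays = Seq(OrderEvent(1234L, "pay", "akdb3833", 1611047616L))
    val receipts = Seq(
      ReceiptEvent("akdb3833", "alipay", 1611047619L),
      ReceiptEvent("akdb3832", "wechat", 1611049617L))
    println(reconcile(pays, receipts))
  }
}
```

On the sample data, transaction akdb3833 is matched, and the receipt akdb3832 has no pay event, so it belongs in the unmatched-receipt output.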

The second approach: use a join to achieve the same effect
This approach is more convenient because the join logic is already encapsulated. The drawback is just as clear: more complex requirements, such as emitting the events that found no match, cannot be implemented this way. Choose according to what the scenario actually needs.

/*
 *
 * @author mafei
 * @date 2021/1/31
 */
package com.mafei.orderPayMonitor
import com.mafei.orderPayMonitor.TxMatch.getClass
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction
import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, createTypeInformation}
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.util.Collector
object TxMatchWithJoin {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // 1. Read the data from the order file
    val resource = getClass.getResource("/OrderLog.csv")
    val orderEventStream = env.readTextFile(resource.getPath)
      .map(d => {
        val arr = d.split(",")
        OrderEvent(arr(0).toLong, arr(1), arr(2), arr(3).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .filter(_.eventType == "pay")
      .keyBy(_.txId) // group by transaction ID

    // 2. Read the data from the receipt file
    val receiptResource = getClass.getResource("/ReceiptLog.csv")
    val receiptEventStream = env.readTextFile(receiptResource.getPath)
      .map(d => {
        val arr = d.split(",")
        ReceiptEvent(arr(0), arr(1), arr(2).toLong) // convert each line to the case class
      })
      .assignAscendingTimestamps(_.ts * 1000L) // use ts (seconds) as the event time, in milliseconds
      .keyBy(_.orderId) // the orderId field of a receipt holds the transaction ID

    // 3. Join each pay event with receipts from 3 seconds before to 5 seconds after it
    val resultStream = orderEventStream.intervalJoin(receiptEventStream)
      .between(Time.seconds(-3), Time.seconds(5))
      .process(new TxMatchWithJoinResult())

    resultStream.print()
    env.execute()
  }
}

class TxMatchWithJoinResult() extends ProcessJoinFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)] {
  override def processElement(in1: OrderEvent, in2: ReceiptEvent, context: ProcessJoinFunction[OrderEvent, ReceiptEvent, (OrderEvent, ReceiptEvent)]#Context, collector: Collector[(OrderEvent, ReceiptEvent)]): Unit = {
    // Called once per matched pair; unmatched events never reach this function
    collector.collect((in1, in2))
  }
}
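The interval condition of the join can be checked by hand. The sketch below, in plain Scala without Flink, tests whether a receipt timestamp falls into the interval around a pay timestamp; `IntervalJoinBounds` and `withinInterval` are hypothetical names, timestamps are in seconds as in the CSV files, and both bounds are treated as inclusive, which to my knowledge is the default behaviour of the interval join.

```scala
object IntervalJoinBounds {
  // True if receiptTs lies in [payTs + lowerSeconds, payTs + upperSeconds], bounds inclusive.
  // The defaults mirror between(Time.seconds(-3), Time.seconds(5)) from the job above.
  def withinInterval(payTs: Long, receiptTs: Long,
                     lowerSeconds: Long = -3L, upperSeconds: Long = 5L): Boolean = {
    val diff = receiptTs - payTs
    diff >= lowerSeconds && diff <= upperSeconds
  }

  def main(args: Array[String]): Unit = {
    // Pay event akdb3833 at 1611047616, receipt akdb3833 at 1611047619
    println(withinInterval(1611047616L, 1611047619L))
  }
}
```

For transaction akdb3833 the receipt arrives 3 seconds after the pay event, which is inside the interval, so the pair is emitted. The receipt akdb3832 has no pay event with the same key and therefore produces no output at all.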